Newsgroups: php.internals
Message-ID: <0BBF41AF-2516-44D4-A102-73580C5ED373@newclarity.net>
Subject: Re: [PHP-DEV] [RFC] Deprecations for PHP 8.4
Date: Thu, 27 Jun 2024 01:31:44 -0400
From: mike@newclarity.net (Mike Schinkel)
To: "Gina P. Banyard"
Cc: PHP internals

> On Jun 26, 2024, at 8:14 AM, Gina P. Banyard <internals@gpb.moe> wrote:
>
> On Wednesday, 26 June 2024 at 06:18, Mike Schinkel <mike@newclarity.net> wrote:
>> https://3v4l.org/RDYFs#v8.3.8
>>
>> Note those seven use-cases are found in around the first 25 results when searching GitHub for "strtok(".
>> I could probably find more if I kept looking:
>>
>> https://github.com/search?q=strtok%28+language%3APHP+&type=code
>>
>> Regarding explode($delimiter, $str)[0], unless it is to be special-cased during compilation, it is a really inefficient way to find the substring up to the first character, especially for large strings and/or when in a tight loop where the explode is contained in a called function
>
> Then use a regex: https://3v4l.org/SGWL5

Using `preg_match()` instead of `strtok()` to process the ~4k file of commas takes, on average, about the same time as using `explode()[0]`, or about 10x as long as using `strtok()` (at times the ratio got as low as 4.4x, but that was rare):

https://onlinephp.io/c/e1fad

Size of file:          3972
Number of commas:      359
Time taken for strtok: 0.003 seconds
Time taken for regex:  0.0307 seconds
Times strtok() faster: 10.25

> Or a combination of strpos and substr.

Using `strpos()` + `substr()` instead of `strtok()` to process the ~4k file of commas took, on average, ~3x as long as using `strtok()`. I implemented a class for this and tried to optimize it by using only string positions and not copying the string repeatedly. It also took about half an hour to get the code working, vs. about 15 seconds to get the code working with `strtok()`; which will most programmers prefer?

https://onlinephp.io/c/2a09f

Size of file:           3972
Number of commas:       359
Time for strtok:        0.0027 seconds
Time for strpos/substr: 0.0089 seconds
Times strtok() faster:  3.31

> There are *plenty* of solutions to the specific problem you pose here, and thus many different solutions more or less appropriate.

Yes, and in all cases the existing solutions are significantly slower, except one.

And that one solution that is not significantly slower is to not deprecate `strtok()`. Not to mention that not deprecating it would avoid causing lots of BC breakage.
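
For anyone who wants to try the comparison locally, the three approaches discussed above can be sketched roughly as follows. This is not the code behind the onlinephp.io links: the function names (`tokens_strtok`, `tokens_strpos`, `tokens_preg`) and the synthetic 359-comma input are assumptions made for illustration, and the sketch assumes a single-character delimiter. Absolute timings and ratios will vary by machine and PHP version.

```php
<?php
// Hypothetical helpers for illustration only; not the benchmark scripts linked above.
// All three assume a single-character delimiter and skip empty tokens, as strtok() does.

/** Tokenize with strtok(): the first call takes the string, later calls only the delimiter. */
function tokens_strtok(string $s, string $delim): array {
    $out = [];
    for ($tok = strtok($s, $delim); $tok !== false; $tok = strtok($delim)) {
        $out[] = $tok;
    }
    return $out;
}

/** Tokenize with strpos() + substr(), advancing an offset instead of re-copying the remainder. */
function tokens_strpos(string $s, string $delim): array {
    $out = [];
    $pos = 0;
    $len = strlen($s);
    while ($pos < $len) {
        $next = strpos($s, $delim, $pos);
        if ($next === false) {
            $next = $len;               // last token runs to the end of the string
        }
        if ($next > $pos) {             // skip empty tokens between adjacent delimiters
            $out[] = substr($s, $pos, $next - $pos);
        }
        $pos = $next + 1;               // step past the one-character delimiter
    }
    return $out;
}

/** Tokenize with repeated preg_match() calls, anchored at the current offset via \G. */
function tokens_preg(string $s, string $delim): array {
    $d = preg_quote($delim, '/');
    $pattern = '/\G' . $d . '*([^' . $d . ']+)/';
    $out = [];
    $offset = 0;
    while (preg_match($pattern, $s, $m, 0, $offset) === 1) {
        $out[] = $m[1];
        $offset += strlen($m[0]);       // consume any leading delimiters plus the token
    }
    return $out;
}

// Synthetic input (an assumption, not the ~4k file from the post): 360 fields, 359 commas.
$data = implode(',', range(1, 360));

foreach (['tokens_strtok', 'tokens_strpos', 'tokens_preg'] as $fn) {
    $start = hrtime(true);
    for ($i = 0; $i < 1000; $i++) {
        $tokens = $fn($data, ',');
    }
    $ms = (hrtime(true) - $start) / 1e6;    // nanoseconds to milliseconds
    printf("%-14s %4d tokens  %8.2f ms\n", $fn, count($tokens), $ms);
}
```

Note that `strtok()` treats its second argument as a set of delimiter characters while `strpos()` treats it as a substring, so the helpers only agree for one-character delimiters.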

-Mike