Newsgroups: php.internals
Date: Thu, 27 Jun 2024 09:44:55 +0200
Subject: Re: [PHP-DEV] [RFC] Deprecations for PHP 8.4
To: internals@lists.php.net
From: markus.podar@gmail.com (Markus Podar)

Hi,

On 26.06.24 07:18, Mike Schinkel wrote:
>> On Jun 25, 2024, at 4:51 PM, Gina P. Banyard wrote:
>>
>> On Tuesday, 25 June 2024 at 19:06, Mike Schinkel wrote:
>>>
>>> strtok()
>>> =====
>>> strtok() is found 35k times in GitHub:
>>>
>>> https://github.com/search?q=strtok%28+language%3APHP+&type=code
>>>
>>> It is commonly used as a "left part of string up to a character" in
>>> addition to its intended use for tokenizing.
>>>
>>> I would prefer it not be deprecated because of BC breakage, but IF it is
>>> deprecated I would suggest adding a one-for-one replacement function
>>> for the "left part of string up to a character" use-case; maybe
>>> `str_left("abc.txt",".")` returning `"abc"`.
>>
>> For this exact case of extracting a file name without an extension,
>> you should really just use:
>> |pathinfo($filepath, PATHINFO_FILENAME);|
>> But for something more generic, you can just do:
>> explode($delimiter, $str)[0];
>>
>> So I really don't see why we would need an "str_left()" function.
>
> Ah, *the dangers of providing a specific example of a broader use-case*
> is that someone will invariably discredit the specific example instead
> of focusing on the applicability for the broader use-case. 🤦‍♂️
>
> To wit, here are seven (7) use-cases for which `pathinfo()` is not a
> viable alternative:
>
> https://3v4l.org/RDYFs#v8.3.8
>
> Note those seven use-cases are found in around the first 25 results when
> searching GitHub for "strtok(".
> I could probably find more if I kept looking:
>
> https://github.com/search?q=strtok%28+language%3APHP+&type=code
>
> Regarding explode($delimiter, $str)[0]: unless it is to be
> special-cased during compilation, it is a really inefficient way to find
> the substring up to the first character, especially for large strings
> and/or when in a tight loop where the explode is contained in a called
> function.
>
> Here is a benchmark (https://onlinephp.io/c/87341) showing that, on
> average over the runs I performed, fully processing a 3972 byte
> file with 359 commas took right at */90 times/* longer using
> explode($delimiter, $str)[0] vs. strtok($str,$delimiter). Imagine if the
> file were 39,720 bytes, or larger, instead.
>
> Size of file:                3972
> Number of commas:            359
> Time taken for strtok:       0.0034 seconds
> Time taken for explode:      0.3036 seconds
> *Times strtok() faster:     89.1*
>
> Yes the above processes the entire file using explode()[0] each time
> rather than first using explode(",") once, because of the equivalent of
> the N+1 problem[1] where the explode() is buried in a function. This
> illustrates why strtok() is so good for its primary use-case of parsing
> text files. strtok() is fast and does not use heaps of memory on every
> token.
>
> This leads me to think `strtok()` */should not/* be deprecated given how
> inefficient string handling in PHP can otherwise be, at least not
> without a much more efficient object for string parsing.

I'm with Mike on `strtok()` and don't understand why it would be on a
deprecation list. I see nothing inherently "wrong" or "dangerous" with
it: it's one of those "works as intended" functions, and if you know how
to use it, it works perfectly the way it is designed.

The variety of suggestions in other replies on how to handle certain
use cases of `strtok()` already shows there's no clear migration path and
that it depends on the situation, which is the worst.
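
To make the "depends on the situation" point concrete, here is a minimal sketch comparing three candidate replacements for the "left part of string up to a character" use case. The helper name `left_part_variants()` and the sample inputs are made up for illustration and do not come from the thread. The candidates agree on the simple case but diverge on edge cases.

```php
<?php
// Hypothetical helper (not a PHP built-in): returns what each candidate
// yields for "the part of $input before the first $delimiter".
function left_part_variants(string $input, string $delimiter): array
{
    return [
        // strtok() skips leading delimiters and returns the first token.
        'strtok'  => strtok($input, $delimiter),
        // A limit of 2 stops splitting after the first delimiter, but the
        // remainder of the string is still copied into element 1.
        'explode' => explode($delimiter, $input, 2)[0],
        // With $before_needle = true, strstr() returns the part before the
        // needle, or false when the needle does not occur at all.
        'strstr'  => strstr($input, $delimiter, true),
    ];
}

// Illustrative inputs only.
foreach (["abc.txt", ".hidden", "nodelimiter"] as $input) {
    echo $input, " => ";
    var_dump(left_part_variants($input, "."));
}
// "abc.txt":     all three return "abc"
// ".hidden":     strtok gives "hidden"; explode and strstr give ""
// "nodelimiter": strtok and explode give the whole string; strstr gives false
```

Which of these is the "right" replacement depends on how the calling code treats leading delimiters and missing delimiters, so a mechanical one-for-one rewrite is not obviously safe.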
Compare this with suggestions like `sha1()` or similar, where the
deprecation is about "the function, but not the functionality", because
SHA1 is available by other means. But there's no clear alternative to
`strtok()`, as it is its own kind of function.

👎 on deprecating it; if a gotcha with it is not clear (e.g. using it in
different scopes, as this was brought up), I see this rather as a
"documentation problem".

cheers,
- Markus
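
P.S. For readers unfamiliar with the scope gotcha, here is a minimal sketch of how I understand it, assuming the issue meant is that `strtok()` keeps one shared tokenizer state per request. The helper `first_word()` is hypothetical and only stands in for any called function that happens to use `strtok()` itself. The union return type needs PHP 8.0 or later.

```php
<?php
// Hypothetical helper that uses strtok() internally.
function first_word(string $text): string|false
{
    // Passing a string argument replaces strtok()'s shared internal state.
    return strtok($text, " ");
}

$first  = strtok("a,b,c", ",");      // starts tokenizing "a,b,c"
$word   = first_word("hello world"); // silently switches state to "hello world"
$second = strtok(",");               // continues on "hello world", not "a,b,c"

var_dump($first, $word, $second);
// string(1) "a"
// string(5) "hello"
// string(5) "world"
```

The caller expected `"b"` as the second token. This is the kind of behaviour a note in the manual could cover.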