Newsgroups: php.internals
Xref: news.php.net php.internals:112917
Date: Mon, 18 Jan 2021 11:32:20 +0100
From: nikita.ppv@gmail.com (Nikita Popov)
To: Adam Cable
Cc: PHP internals
Subject: Re: [PHP-DEV] Addition of substring and subistring functions.
References: <046401d6e999$f4de5d50$de9b17f0$@adsar.co.uk>
In-Reply-To: <046401d6e999$f4de5d50$de9b17f0$@adsar.co.uk>

On Wed, Jan 13, 2021 at 11:51 AM Adam Cable wrote:

> Hi internals.
>
> I've been coding in PHP for 15 years now, and spend most days using it to
> transform content into meaningful data.
>
> Most of this is pulling out prices and attributes of certain products from
> HTML, for example, grabbing the price from content such as "Total price
> including delivery: £15.00".
>
> To grab the "15.00" from the string can take quite a few lines of PHP and
> can be pretty cumbersome.
>
> I've built some helper functions - substring, and its case-insensitive
> variant subistring to help.
> Functions take in the string plus a to and from string, and return a trim'd
> string found between the two.
>
> So substring("Total price including delivery: £15.00", "£", "")
> would return "15.00".
> From and to strings are optional and therefore return from the beginning or
> to the end.
>
> In the past I hadn't thought about adding this to PHP core, but with the
> introduction of str_starts/ends_with functions in PHP 8.0 I thought it may
> be useful to include the sub(i)string building blocks too.
>
> Implementation and tests can be found @
> https://github.com/php/php-src/pull/6602
>
> I'm sure the C implementation can be made a lot better, but it seems to
> work OK at present.
>
> This is my first e-mail to internals, so please excuse my naivety with
> things, but hope this is useful.
>
> Thanks,
> Adam
>

Hi Adam,

A few thoughts:

1. The name of the function is not clear. It's not obvious what the
difference between substr() and substring() is. To make it worse,
JavaScript has both substr() and substring(), but the meaning of
substring() there is a different one from what you propose. I think a
better name would be something like str_between().

2. Why does this perform an implicit trim() call? I understand that this
may be useful in some cases, but it will also limit applicability of the
function. It's easy to write trim(substring(...)), but if the trim() is
part of the call, there's no way to avoid it.

3. More generally, I feel that this API is a bit too specific for inclusion
in the standard library. It can certainly be useful, but I don't think it's
anywhere near as ubiquitous as operations like str_ends_with(). For complex
string matching tasks, I would probably pick preg_match() over a
combination of str* functions anyway.

Regards,
Nikita
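
For illustration only, the points above can be sketched in userland PHP.
This is not the implementation from the pull request. str_between() is the
name suggested in point 1 and does not exist in PHP, and returning null when
a delimiter is missing is an assumption made for this sketch. The function
does no implicit trim(), so the caller opts in as described in point 2. The
last lines show the preg_match() route from point 3, with a pattern that is
likewise just an example.

```php
<?php
// Hypothetical userland sketch; str_between() is NOT a built-in PHP function.
// Returns the text after the first $from and before the next $to.
// An empty $from means "from the beginning", an empty $to means "to the end".
// Returns null when a non-empty delimiter is not found (assumed behaviour).
function str_between(string $haystack, string $from = '', string $to = ''): ?string
{
    $start = 0;
    if ($from !== '') {
        $pos = strpos($haystack, $from);
        if ($pos === false) {
            return null;
        }
        // Offsets are byte-based, so multi-byte delimiters such as "£" work.
        $start = $pos + strlen($from);
    }

    if ($to === '') {
        return substr($haystack, $start);
    }

    $end = strpos($haystack, $to, $start);
    if ($end === false) {
        return null;
    }

    return substr($haystack, $start, $end - $start);
}

$text = 'Total price including delivery: £15.00';

// Delimiter-based extraction; nothing is trimmed unless the caller asks.
$price = str_between($text, '£');                 // "15.00"
$label = trim(str_between($text, 'Total', ':'));  // "price including delivery"

// The preg_match() alternative: capture the digits that follow the "£" sign.
preg_match('/£(\d+(?:\.\d{2})?)/u', $text, $m);
$priceViaRegex = $m[1] ?? null;                   // "15.00"
```

A case-insensitive variant could swap strpos() for stripos() in the same
sketch.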