Message-ID: <92c4514f-70e3-75c9-7084-9e29641e25e7@gmail.com>
Date: Tue, 14 Feb 2023 21:02:39 +0000
To: internals@lists.php.net
Subject: Re: [PHP-DEV] [RFC] Working With Substrings
From: rowan.collins@gmail.com (Rowan Tommins)

On 14/02/2023 15:32, Thomas Hruska wrote:
> Hello Internals,
>
> I would like to start the discussion on adding several functions and
> parameters to existing functions for improved substring handling in PHP:
>
> https://wiki.php.net/rfc/working_with_substrings

Hi Thomas,

Thanks for your effort on this, I think efficient string handling
functions would be a major help for the ecosystem, allowing library
authors to do things in plain PHP code which currently defer to C
extensions just for performance.

My first thought opening the RFC was to see a function signature with 9
arguments and immediately wonder how to refactor it into something more
manageable. Just writing *tests* for all the combinations sounds like a
nightmare, let alone understanding code that uses them all.

As I read through, I had a similar feeling about the need to
copy-and-paste the same two parameters onto so many functions.

Luckily, I think the RFC contains the seed of the solution to both
problems: what you refer to as "virtual buffers". These seem to be
crying out to be a new data type, with their own API - probably using OO
style, given general fashions.

Framed around that, I think we can split out a few different concerns:

* Methods to take a string, and make a new, writeable buffer pointing at
  all or part of it
* Methods to access parts of a buffer, as a string or another buffer
* Methods to efficiently write to, delete from, or overwrite, parts of a
  buffer
* Methods to explicitly manage the memory used by the buffer
* Finally, support for writing to, or reading from, a buffer instead of
  a string in a number of existing functions

Thinking about exactly what those methods should look like leads me to
my next thought: we should be learning from prior art here. Are there
other languages which already do this well, which PHP could emulate? Are
there other languages which already do this *badly*, whose mistakes PHP
could explicitly learn from?

What comes to my mind immediately is that both Java and C# have
"StringBuilder" classes, which cover at least some of these use cases.
C#, in particular, had a lot of very smart people paid to design it,
able to learn from mistakes Java had already made.

Regards,

--
Rowan Tommins
[IMSoP]
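
As a rough illustration of the prior art mentioned above, here is a minimal sketch of how Java's existing StringBuilder maps onto the five concerns in the list. It is not a proposal for PHP's API. The class name StringBuilderSketch and its helper methods are made up for this example; only the StringBuilder and Formatter calls are real Java APIs. Note that StringBuilder copies the characters it is given, so it does not provide a zero-copy view "pointing at" part of an existing string.

```java
import java.util.Formatter;
import java.util.Locale;

// Hypothetical demo class; only the java.lang / java.util calls are real APIs.
public class StringBuilderSketch {

    // Concern 1: make a writable buffer from part of a string.
    // append(CharSequence, int, int) copies only the range [start, end).
    static StringBuilder bufferFrom(String source, int start, int end) {
        return new StringBuilder().append(source, start, end);
    }

    // Concerns 2 and 3: read parts of the buffer, and edit it in place.
    static String editInPlace() {
        StringBuilder buf = bufferFrom("Hello, world!", 7, 12); // "world"
        buf.insert(0, "brave ");            // "brave world"
        buf.replace(0, 5, "new");           // "new world"
        buf.delete(0, 4);                   // "world"
        buf.setCharAt(0, 'W');              // "World"
        String part = buf.substring(1, 3);  // "or", copied out as a String
        return buf + "/" + part;            // "World/or"
    }

    // Concern 4: explicit control over the memory behind the buffer.
    static int reserve(int minimum) {
        StringBuilder buf = new StringBuilder();
        buf.ensureCapacity(minimum);  // grow the backing storage up front
        int reserved = buf.capacity();
        buf.trimToSize();             // ask for unused capacity to be released
        return reserved;
    }

    // Concern 5: an existing API writing into a buffer instead of
    // returning a new string (Formatter accepts any Appendable).
    static String formatInto() {
        StringBuilder buf = new StringBuilder("count=");
        new Formatter(buf, Locale.ROOT).format("%d", 42);
        return buf.toString();        // "count=42"
    }

    public static void main(String[] args) {
        System.out.println(editInPlace());
        System.out.println(reserve(64) >= 64);
        System.out.println(formatInto());
    }
}
```

The point of the sketch is that each concern becomes one or two small methods on a dedicated type, rather than extra parameters repeated across many string functions.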