Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:119564
MIME-Version: 1.0
References: <e352423f-b740-07c9-2c4a-996112e17bbe@cubiclesoft.com>
 <92c4514f-70e3-75c9-7084-9e29641e25e7@gmail.com> <7e86a2d2-b971-592c-64e3-e86c13b5be80@cubiclesoft.com>
 <E963ED74-0404-4A5D-9811-8D1E662F764A@gmail.com> <84204896-F9CE-4186-8A72-573A0B46FC1D@gmail.com>
 <CAM9Wwz7Si98GDoJHaUKoJtOWt_UzzkjacohP4Z0XdRJsMnOPgg@mail.gmail.com> <68DBCD9C-849A-4840-9437-AE59F90A8B9C@php.net>
In-Reply-To: <68DBCD9C-849A-4840-9437-AE59F90A8B9C@php.net>
Date: Thu, 16 Feb 2023 14:37:19 +0100
Message-ID: <CAM9Wwz79raER=ovZ7V85rjf1ZSLAx3yqdBvDmHpccHtAULr6rQ@mail.gmail.com>
To: internals@lists.php.net
Content-Type: multipart/alternative; boundary="0000000000005ae6d605f4d14dc7"
Subject: Re: [PHP-DEV] [RFC] Working With Substrings
From: flexjoly@gmail.com (Lydia de Jongh)


Hi Derick, Thomas,


On Thu, 16 Feb 2023 at 08:57, Derick Rethans <derick@php.net> wrote:

>
>
> https://wiki.php.net/rfc/unicode_text_processing
>
> And yes, that won't be as fast as just calling strtoupper.
>
> cheers
> Derick
>

Looks great!!!

Complex string manipulation inside an object will be faster than
copying variables around in memory, as Thomas kindly explained in his
post, if I understand correctly.

And it would make PHP even more mature by making better use of OOP.



On Wed, 15 Feb 2023 at 20:35, Thomas Hruska <thruska@cubiclesoft.com> wrote:

> <......>

> Doing that operation one time is fast enough and not really a problem.
> Doing it 1,000,000 times in a loop is where we end up constantly copying
> memory around when we could potentially work on the same memory buffer
> the entire time.  We still might end up using the same memory buffers
> over and over due to recycling them through the PHP memory pool, which
> means the buffers might get to sit in the L1 or L2 cache in the CPU, but
> it does leave some performance on the table because copying a buffer or
> portions of it repeatedly can be an unnecessary operation.  Buffers that
> are larger than the CPU's cache line sizes are going to suffer the most
> because there will be constant requests to main memory for the
> information that the CPU needs to modify and will constantly flush the
> cache lines and stall out while waiting for more data to arrive.  That's
> not exactly optimal/ideal.  A buffer that is modified in place is
> more likely to stay in the L1 and L2 caches and therefore be much
> closer to the CPU core, resulting in notably faster performance.
> Pointers in C are much faster than copying memory.  The problem is
> exposing pointers to userland, especially in Internet-facing software.
> Pointers are notoriously unsafe - just look at the zillion buffer
> overflow vulnerabilities (CVEs) that are reported annually across all
> software products.  Copy-on-write, by comparison, is a much safer
> operation at the cost of performance.  However, pointers let us just
> point at a substring or general chunk of memory instead of copying it,
> which significantly reduces the overhead since pointers are simple
> integer values that contain a memory address.  And those values are
> small enough to sit in CPU registers, which are blazing fast.  CPUs only
> have a handful of registers though because each register dramatically
> increases the cost of the CPU die.  So if we can just point at the
> memory we want to "extract" instead of actually copying the data into
> its own string object, we can potentially save a ton of CPU cycles,
> especially when working with data inside a loop.
>
>
> Overall, I think substrings offer the most obvious/apparent area for
> performance gains and probably have, implementation details aside, the
> least amount of friction.  But maybe we should consider the larger
> ecosystem of string functions as well?  Or should this just be a
> possible longer term idea that requires more thought and research and
> thus the scope should be limited and we put Lydia's idea under Future
> Scope in the RFC?  Other thoughts/comments?
>
> Added as Open Issue 10 to the RFC.  Thank you for your input.
>
> Thomas Hruska
>

Thanks for your kind and extensive explanation.
I know a little about memory allocation.

But I am not sure what to conclude from your explanation: would an
object involve less copying around or not?

This memory conversation brings up other old memories ☺... peek, poke,
assembly etc. 😍

Greetz, flexJoly (aka Lydia)
