Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:126192 X-Original-To: internals@lists.php.net Delivered-To: internals@lists.php.net Received: from php-smtp4.php.net (php-smtp4.php.net [45.112.84.5]) by qa.php.net (Postfix) with ESMTPS id 4EB721A00BD for ; Tue, 31 Dec 2024 15:55:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=php.net; s=mail; t=1735660328; bh=uMD7WfWH33iCUtv8Jed14DynslEd/OGuALcz1FzhBPM=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=UUNtoP8/TVnxQh1rgJ7l8iPaL0kwdZ9pmC3qHlyg6Sj1DpyxaCiYX6O787Ee+TVXh iJy6xP0vGQKCqZYr1Y28Jnp1zJYv7ebp8CePEzdyybnT9VOtHw4wUc5WHzsTeBtt71 5a5qafSk1csNOKLGNJTgoQxjuEma4KW+zY4kWuyBHvaf+Dm/zYkzkNyS+C5DjTFHgG KdpDCo80+n7tSmqoiMR7owCsPmRu1tHDjk5yLU86XJcTQ2GwEjEm55hcttA39jYUtu Q1yHyX2BZZ2kGs/RCX/oEAhk+g7WxqQqOAS6i15z8sDdHHAN6/Mw5qgx1ABziP3Zez yfkGEX1rak4NQ== Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id 8ACB218004B for ; Tue, 31 Dec 2024 15:52:06 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=0.6 required=5.0 tests=BAYES_50,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,DMARC_MISSING,HTML_MESSAGE, SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=4.0.0 X-Spam-Virus: No X-Envelope-From: Received: from venus.thgnet.it (venus.thgnet.it [159.69.22.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Tue, 31 Dec 2024 15:52:05 +0000 (UTC) Received: from mail-oa1-f54.google.com (mail-oa1-f54.google.com [209.85.160.54]) by venus.thgnet.it (Postfix) with ESMTPSA id 3545A3ED9D for ; Tue, 31 Dec 2024 16:55:02 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=giacobbi.net; s=venus; t=1735660502; bh=2i674twRDUdtx9f4RurRGF9YzFPxlsGxkwh0Qj7C+UM=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=FzdDVm0MIE4US3eivqGfsvPrfEFTQPFwz32o0zug5s63wGqrYM6ZP2GsgTZUMAvC5 y5+IJrybeeHOk05uLra03XwKnjmnHgXbbxgdQiRFVvKw4ixATNziH9JoZ0JYp1JKud NKBE9kjdPcegJJehVITSL68uarCQTJbeDasaNZyk= Received: by mail-oa1-f54.google.com with SMTP id 586e51a60fabf-2a0206590a7so4724396fac.0 for ; Tue, 31 Dec 2024 07:55:02 -0800 (PST) X-Gm-Message-State: AOJu0Yxee82W2F0twPXtKEK2M6scnKGT50CYkXQ9tWQ+jgNNBMAtohit LqL3IYcXjb9U4AlddxegaJaCziYvCZzSJDYVZZKCD5ilZkybO51qlPyS63QXzGb7M/pnHqKnAWR CYMpYJyzE/K3jySuP8Mr+OiC2+Y8= X-Google-Smtp-Source: AGHT+IGZpRvoqdms7koHH7j4sjYtJnqCMsu2yqmTJKcDIkGjRDUrcU6fymzj+FFBEFGyqPi4YkL7k0HpgEKazclVgYM= X-Received: by 2002:a05:6871:a50c:b0:29e:2cd6:4d1d with SMTP id 586e51a60fabf-2a7fb00b99cmr22936483fac.9.1735660500946; Tue, 31 Dec 2024 07:55:00 -0800 (PST) Precedence: bulk list-help: list-post: List-Id: internals.lists.php.net x-ms-reactions: disallow MIME-Version: 1.0 References: <631837de-c2a0-4bde-9400-5e8e5f2a0906@gmx.de> In-Reply-To: <631837de-c2a0-4bde-9400-5e8e5f2a0906@gmx.de> Date: Tue, 31 Dec 2024 16:54:52 +0100 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PHP-DEV] ext/gd new imagematch function (was: adding imagecompare) To: "Christoph M. Becker" Cc: PHP Internals List Content-Type: multipart/alternative; boundary="0000000000009c1de1062a92f36a" From: giovanni@giacobbi.net (Giovanni Giacobbi) --0000000000009c1de1062a92f36a Content-Type: text/plain; charset="UTF-8" On Tue, Dec 31, 2024, 13:42 Christoph M. Becker wrote: > On 11.07.2024 at 11:35, Giovanni Giacobbi wrote: > > > The recent PR #14877 [1] proposes to add the imagecompare gd function > that > > mimics the gdImageCompare function from libgd. I always thought that a > > pixel-by-pixel matching function for two images was a big missing feature > > in PHP, as the corresponding userland implementation is really, REALLY > slow. > > > > The problem with gdImageCompare is that it checks several aspects of the > > two images, and fails on its core purpose. The behavior is better seen in > > its source code [2] rather than explained. I see the following problems: > > - The gdImageCompare is kind of buggy, as it completely disregards the > > alpha channel. As a user, I would expect that two images that have a > pixel > > rgba(255, 0, 0, 100%) and one rgba(255, 0, 0, 50%) are considered > > different, but the current implementation reports them as identical. > > - Checking for bitmasks is cumbersome, and it requires 9 new constants in > > the global namespace (the IMG_CMP_*), which are nearly useless. Most use > > cases will just want to check for this: imagecompare($im1, $im2) & > > IMG_CMP_IMAGE. > > - The checks in gdImageCompare are also a bit arbitrary, why compare the > > interlace flag but not the palette order? Why check the number of colors > in > > the palette if they are not even used in the image? > > - If a user is interested in the other checks, they can all be > implemented > > in userland with existing functions: imageinterlace, > imagecolortransparent, > > imageistruecolor, imagesx, imagesy, imagecolorstotal. > > > > I believe PHP should only offer required building blocks and leave to the > > libraries the implementation of more structured functionality. My > proposal > > is a completely different function called imagematch() that solves the > > exact problem of the pixel-by-pixel matching, with the added value of > > optionally matching portions of the images, using the following > signature: > > > > function imagematch(GdImage $image1, GdImage $image2, int $x1 = 0, int > $y1 > > = 0, int $x2 = 0, int $y2 = 0, ?int $width = null, ?int $height = null): > > bool {} > > > > I already drafted PR #14914 [3] with the first two parameters. There is > not > > an equivalent libgd implementation, so the implementation is entirely on > > the php side, but as you can see it's quite trivial. Even with the > extended > > functionality of matching portions of the images it would be just a few > > extra lines of code, so not a big maintenance burden. > > > > If an RFC is required, I'm happy to redact it after the initial feedback > > round. > > > > [1] https://github.com/php/php-src/pull/14877 > > [2] > > > https://github.com/php/php-src/blob/5586d0c7de00acbfb217e1c6321da918b96ef960/ext/gd/libgd/gd.c#L2834 > > [3] https://github.com/php/php-src/pull/14914 > > Difficult. It seems to me we first need to consider the use-case. > Likely useful only for testing purposes, but I may miss some other > use-cases. > > And now comes the really hard part: when do we consider images to be > equal? You have mentioned that gdImageCompare() completely ignores the > alpha channel, and that would be a mistake. I tend to agree, but if the > images are stored via imagejpeg(), for instance, the alpha channel would > be ignored as well, so the resulting images would be equal. > For the "test-oriented" use case I agree, but it doesn't only apply to php's internal testsuite: i maintain quite a few projects which do image manipulation whose testsuites would benefit from this, and that already places us in userspace use case. For the "what to consider different" i would apply the KISS principle and just have a pixel-by-pixel yes/no difference, not even considering palette/truecolor (as per my sample PR implementation). If you need to do further checks (i.e. the mentioned palette/truecolor) you can still check in userspace for o(1). The idea being, can we provide the most expensive operation, that is basic pixel by pixel comparison, as native function? Otherwise we keep discussing bikeshedding and years go by without this being available. This does not stop future more advanced comparison algorithms from being introduced (i.e. imagediff()). > Then there is the question whether two fully transparent pixels (alpha > value 127) are different if their RGB channels have different values. > In theory, sure, but it practice is may not really matter. > Fully transparent black should be considered matching fully transparent white, cause there can be no possible discerning difference in the rendering. Then we have the transparent pixel. Should equality comparison take > that into account? Again, in theory, it is not the same as a pixel with > alpha channel 127, but in practice it may very well be. > Yes. Overlaying a black 100% opaque pixel is different from overlaying 50% black pixel, on a non-black surface. Again result is "rendering oriented". > Now you claim that doing pixel-to-pixel comparisons in PHP would be > *really* slow. I'd argue that with modern PHP the actual looping over > the pixels isn't that slow. Even getting the pixel colors > (imagecolorat()) and checking them for equality shouldn't be that slow. > I'm happy to run a quick bechmark and provider results, but my feeling is we are talking orders of magnitude (without jit, i personally don't consider it production ready). But I will also benchmarm it with JIT enabled. It only gets slow if you want to do more elaborate comparisons; e.g. > getting the actual values a palette entry refers to, but if you move > that out of the loop, performance might not be that bad. Implementing > that in C is certainly faster, but the difference should be measured, > and I wonder how much could be gained using SIMD instructions (something > libgd still ignores completely). > I'm talking about a naive pixel by pixel comparison, in plain C, without any special cpu optimization. Using simd would be nice but out of scope (that should definitely be a libgd discussion). I will gladly follow up with numbers in a few days. --0000000000009c1de1062a92f36a Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Tue, Dec 31, 2024, 13:42 Christoph M.= Becker <cmbecker69@gmx.de> = wrote:
On 11.07.= 2024 at 11:35, Giovanni Giacobbi wrote:

> The recent PR #14877 [1] proposes to add the imagecompare gd function = that
> mimics the gdImageCompare function from libgd. I always thought that a=
> pixel-by-pixel matching function for two images was a big missing feat= ure
> in PHP, as the corresponding userland implementation is really, REALLY= slow.
>
> The problem with gdImageCompare is that it checks several aspects of t= he
> two images, and fails on its core purpose. The behavior is better seen= in
> its source code [2] rather than explained. I see the following problem= s:
> - The gdImageCompare is kind of buggy, as it completely disregards the=
> alpha channel. As a user, I would expect that two images that have a p= ixel
> rgba(255, 0, 0, 100%) and one rgba(255, 0, 0, 50%) are considered
> different, but the current implementation reports them as identical. > - Checking for bitmasks is cumbersome, and it requires 9 new constants= in
> the global namespace (the IMG_CMP_*), which are nearly useless. Most u= se
> cases will just want to check for this: imagecompare($im1, $im2) &=
> IMG_CMP_IMAGE.
> - The checks in gdImageCompare are also a bit arbitrary, why compare t= he
> interlace flag but not the palette order? Why check the number of colo= rs in
> the palette if they are not even used in the image?
> - If a user is interested in the other checks, they can all be impleme= nted
> in userland with existing functions: imageinterlace, imagecolortranspa= rent,
> imageistruecolor, imagesx, imagesy, imagecolorstotal.
>
> I believe PHP should only offer required building blocks and leave to = the
> libraries the implementation of more structured functionality. My prop= osal
> is a completely different function called imagematch() that solves the=
> exact problem of the pixel-by-pixel matching, with the added value of<= br> > optionally matching portions of the images, using the following signat= ure:
>
> function imagematch(GdImage $image1, GdImage $image2, int $x1 =3D 0, i= nt $y1
> =3D 0, int $x2 =3D 0, int $y2 =3D 0, ?int $width =3D null, ?int $heigh= t =3D null):
> bool {}
>
> I already drafted PR #14914 [3] with the first two parameters. There i= s not
> an equivalent libgd implementation, so the implementation is entirely = on
> the php side, but as you can see it's quite trivial. Even with the= extended
> functionality of matching portions of the images it would be just a fe= w
> extra lines of code, so not a big maintenance burden.
>
> If an RFC is required, I'm happy to redact it after the initial fe= edback
> round.
>
> [1] https://github.com/php/php-src/pull/148= 77
> [2]
> https://github.com/php/php-src/blob/5586d0c7de00acbfb217e1c= 6321da918b96ef960/ext/gd/libgd/gd.c#L2834
> [3] https://github.com/php/php-src/pull/149= 14

Difficult.=C2=A0 It seems to me we first need to consider the use-case.
Likely useful only for testing purposes, but I may miss some other
use-cases.

And now comes the really hard part: when do we consider images to be
equal?=C2=A0 You have mentioned that gdImageCompare() completely ignores th= e
alpha channel, and that would be a mistake.=C2=A0 I tend to agree, but if t= he
images are stored via imagejpeg(), for instance, the alpha channel would be ignored as well, so the resulting images would be equal.

For the "te= st-oriented" use case I agree, but it doesn't only apply to php= 9;s internal testsuite: i maintain quite a few projects which do image mani= pulation whose testsuites would benefit from this, and that already places = us in userspace use case.

For the "what to consider different" i would apply the KISS pri= nciple and just have a pixel-by-pixel yes/no difference, not even consideri= ng palette/truecolor (as per my sample PR implementation). If you need to d= o further checks (i.e. the mentioned palette/truecolor) you can still check= in userspace for o(1).

= The idea being, can we provide the most expensive operation, that is basic = pixel by pixel comparison, as native function? Otherwise we keep discussing= bikeshedding and years go by without this being available. This does not s= top future more advanced comparison algorithms from being introduced (i.e. = imagediff()).


Then there is the question whether two fully transparent pixels (alpha
value 127) are different if their RGB channels have different values.
In theory, sure, but it practice is may not really matter.
=

Fully transparent= black should be considered matching fully transparent white, cause there c= an be no possible discerning difference in the rendering.

Then we have the transparent pixel.=C2=A0 Should equality comparison take that into account?=C2=A0 Again, in theory, it is not the same as a pixel wi= th
alpha channel 127, but in practice it may very well be.

Yes. Overlaying a bl= ack 100% opaque pixel is different from overlaying 50% black pixel, on a no= n-black surface. Again result is "rendering oriented".


Now you claim that doing pixel-to-pixel comparisons in PHP would be
*really* slow.=C2=A0 I'd argue that with modern PHP the actual looping = over
the pixels isn't that slow.=C2=A0 Even getting the pixel colors
(imagecolorat()) and checking them for equality shouldn't be that slow.=

= I'm happy to run a quick bechmark and provider results, but my feeling = is we are talking orders of magnitude (without jit, i personally don't = consider it production ready). But I will also benchmarm it with JIT enable= d.

It only gets slow if you want to do more elaborate comparisons; e.g.
getting the actual values a palette entry refers to, but if you move
that out of the loop, performance might not be that bad.=C2=A0 Implementing=
that in C is certainly faster, but the difference should be measured,
and I wonder how much could be gained using SIMD instructions (something libgd still ignores completely).

I'm talking about a naive pixel by pixe= l comparison, in plain C, without any special cpu optimization. Using simd = would be nice but out of scope (that should definitely be a libgd discussio= n).

I will gladly follow= up with numbers in a few days.
--0000000000009c1de1062a92f36a--