The recent PR #14877 [1] proposes to add the imagecompare gd function that
mimics the gdImageCompare function from libgd. I always thought that a
pixel-by-pixel matching function for two images was a big missing feature
in PHP, as the corresponding userland implementation is really, REALLY slow.
The problem with gdImageCompare is that it checks several aspects of the
two images, and fails on its core purpose. The behavior is better seen in
its source code [2] rather than explained. I see the following problems:
- The gdImageCompare is kind of buggy, as it completely disregards the
alpha channel. As a user, I would expect that two images that have a pixel
rgba(255, 0, 0, 100%) and one rgba(255, 0, 0, 50%) are considered
different, but the current implementation reports them as identical. - Checking for bitmasks is cumbersome, and it requires 9 new constants in
the global namespace (the IMG_CMP_*), which are nearly useless. Most use
cases will just want to check for this: imagecompare($im1, $im2) &
IMG_CMP_IMAGE. - The checks in gdImageCompare are also a bit arbitrary, why compare the
interlace flag but not the palette order? Why check the number of colors in
the palette if they are not even used in the image? - If a user is interested in the other checks, they can all be implemented
in userland with existing functions: imageinterlace, imagecolortransparent,
imageistruecolor, imagesx, imagesy, imagecolorstotal.
I believe PHP should only offer required building blocks and leave to the
libraries the implementation of more structured functionality. My proposal
is a completely different function called imagematch() that solves the
exact problem of the pixel-by-pixel matching, with the added value of
optionally matching portions of the images, using the following signature:
function imagematch(GdImage $image1, GdImage $image2, int $x1 = 0, int $y1
= 0, int $x2 = 0, int $y2 = 0, ?int $width = null, ?int $height = null):
bool {}
I already drafted PR #14914 [3] with the first two parameters. There is not
an equivalent libgd implementation, so the implementation is entirely on
the php side, but as you can see it's quite trivial. Even with the extended
functionality of matching portions of the images it would be just a few
extra lines of code, so not a big maintenance burden.
If an RFC is required, I'm happy to redact it after the initial feedback
round.
[1] https://github.com/php/php-src/pull/14877
[2]
https://github.com/php/php-src/blob/5586d0c7de00acbfb217e1c6321da918b96ef960/ext/gd/libgd/gd.c#L2834
[3] https://github.com/php/php-src/pull/14914
hi,
as I mentioned in the PR, this is an old function which we did not touch
for bc reasons.
I also linked a better image comparison function used for the gd tests
suite. It covers exaxf match which is what it seems you are aiming to get.
Additionally that function also return an image with the difference (the
diff images contains the channels differences).
A key, more tricky, but nevertheless important compared is a perceptual
difference, but that can be done additionally.
best,
Pierre
On Thu, Jul 11, 2024, 4:39 PM Giovanni Giacobbi giovanni@giacobbi.net
wrote:
The recent PR #14877 [1] proposes to add the imagecompare gd function that
mimics the gdImageCompare function from libgd. I always thought that a
pixel-by-pixel matching function for two images was a big missing feature
in PHP, as the corresponding userland implementation is really, REALLY slow.The problem with gdImageCompare is that it checks several aspects of the
two images, and fails on its core purpose. The behavior is better seen in
its source code [2] rather than explained. I see the following problems:
- The gdImageCompare is kind of buggy, as it completely disregards the
alpha channel. As a user, I would expect that two images that have a pixel
rgba(255, 0, 0, 100%) and one rgba(255, 0, 0, 50%) are considered
different, but the current implementation reports them as identical.- Checking for bitmasks is cumbersome, and it requires 9 new constants in
the global namespace (the IMG_CMP_*), which are nearly useless. Most use
cases will just want to check for this: imagecompare($im1, $im2) &
IMG_CMP_IMAGE.- The checks in gdImageCompare are also a bit arbitrary, why compare the
interlace flag but not the palette order? Why check the number of colors in
the palette if they are not even used in the image?- If a user is interested in the other checks, they can all be implemented
in userland with existing functions: imageinterlace, imagecolortransparent,
imageistruecolor, imagesx, imagesy, imagecolorstotal.I believe PHP should only offer required building blocks and leave to the
libraries the implementation of more structured functionality. My proposal
is a completely different function called imagematch() that solves the
exact problem of the pixel-by-pixel matching, with the added value of
optionally matching portions of the images, using the following signature:function imagematch(GdImage $image1, GdImage $image2, int $x1 = 0, int $y1
= 0, int $x2 = 0, int $y2 = 0, ?int $width = null, ?int $height = null):
bool {}I already drafted PR #14914 [3] with the first two parameters. There is
not an equivalent libgd implementation, so the implementation is entirely
on the php side, but as you can see it's quite trivial. Even with the
extended functionality of matching portions of the images it would be just
a few extra lines of code, so not a big maintenance burden.If an RFC is required, I'm happy to redact it after the initial feedback
round.[1] https://github.com/php/php-src/pull/14877
[2]
https://github.com/php/php-src/blob/5586d0c7de00acbfb217e1c6321da918b96ef960/ext/gd/libgd/gd.c#L2834
[3] https://github.com/php/php-src/pull/14914
The recent PR #14877 [1] proposes to add the imagecompare gd function that
mimics the gdImageCompare function from libgd. I always thought that a
pixel-by-pixel matching function for two images was a big missing feature
in PHP, as the corresponding userland implementation is really, REALLY slow.The problem with gdImageCompare is that it checks several aspects of the
two images, and fails on its core purpose. The behavior is better seen in
its source code [2] rather than explained. I see the following problems:
- The gdImageCompare is kind of buggy, as it completely disregards the
alpha channel. As a user, I would expect that two images that have a pixel
rgba(255, 0, 0, 100%) and one rgba(255, 0, 0, 50%) are considered
different, but the current implementation reports them as identical.- Checking for bitmasks is cumbersome, and it requires 9 new constants in
the global namespace (the IMG_CMP_*), which are nearly useless. Most use
cases will just want to check for this: imagecompare($im1, $im2) &
IMG_CMP_IMAGE.- The checks in gdImageCompare are also a bit arbitrary, why compare the
interlace flag but not the palette order? Why check the number of colors in
the palette if they are not even used in the image?- If a user is interested in the other checks, they can all be implemented
in userland with existing functions: imageinterlace, imagecolortransparent,
imageistruecolor, imagesx, imagesy, imagecolorstotal.I believe PHP should only offer required building blocks and leave to the
libraries the implementation of more structured functionality. My proposal
is a completely different function called imagematch() that solves the
exact problem of the pixel-by-pixel matching, with the added value of
optionally matching portions of the images, using the following signature:function imagematch(GdImage $image1, GdImage $image2, int $x1 = 0, int $y1
= 0, int $x2 = 0, int $y2 = 0, ?int $width = null, ?int $height = null):
bool {}I already drafted PR #14914 [3] with the first two parameters. There is not
an equivalent libgd implementation, so the implementation is entirely on
the php side, but as you can see it's quite trivial. Even with the extended
functionality of matching portions of the images it would be just a few
extra lines of code, so not a big maintenance burden.If an RFC is required, I'm happy to redact it after the initial feedback
round.[1] https://github.com/php/php-src/pull/14877
[2]
https://github.com/php/php-src/blob/5586d0c7de00acbfb217e1c6321da918b96ef960/ext/gd/libgd/gd.c#L2834
[3] https://github.com/php/php-src/pull/14914
Difficult. It seems to me we first need to consider the use-case.
Likely useful only for testing purposes, but I may miss some other
use-cases.
And now comes the really hard part: when do we consider images to be
equal? You have mentioned that gdImageCompare() completely ignores the
alpha channel, and that would be a mistake. I tend to agree, but if the
images are stored via imagejpeg(), for instance, the alpha channel would
be ignored as well, so the resulting images would be equal.
Then there is the question whether two fully transparent pixels (alpha
value 127) are different if their RGB channels have different values.
In theory, sure, but it practice is may not really matter.
Then we have the transparent pixel. Should equality comparison take
that into account? Again, in theory, it is not the same as a pixel with
alpha channel 127, but in practice it may very well be.
Now you claim that doing pixel-to-pixel comparisons in PHP would be
really slow. I'd argue that with modern PHP the actual looping over
the pixels isn't that slow. Even getting the pixel colors
(imagecolorat()) and checking them for equality shouldn't be that slow.
It only gets slow if you want to do more elaborate comparisons; e.g.
getting the actual values a palette entry refers to, but if you move
that out of the loop, performance might not be that bad. Implementing
that in C is certainly faster, but the difference should be measured,
and I wonder how much could be gained using SIMD instructions (something
libgd still ignores completely).
Christoph
The recent PR #14877 [1] proposes to add the imagecompare gd function
that
mimics the gdImageCompare function from libgd. I always thought that a
pixel-by-pixel matching function for two images was a big missing feature
in PHP, as the corresponding userland implementation is really, REALLY
slow.The problem with gdImageCompare is that it checks several aspects of the
two images, and fails on its core purpose. The behavior is better seen in
its source code [2] rather than explained. I see the following problems:
- The gdImageCompare is kind of buggy, as it completely disregards the
alpha channel. As a user, I would expect that two images that have a
pixel
rgba(255, 0, 0, 100%) and one rgba(255, 0, 0, 50%) are considered
different, but the current implementation reports them as identical.- Checking for bitmasks is cumbersome, and it requires 9 new constants in
the global namespace (the IMG_CMP_*), which are nearly useless. Most use
cases will just want to check for this: imagecompare($im1, $im2) &
IMG_CMP_IMAGE.- The checks in gdImageCompare are also a bit arbitrary, why compare the
interlace flag but not the palette order? Why check the number of colors
in
the palette if they are not even used in the image?- If a user is interested in the other checks, they can all be
implemented
in userland with existing functions: imageinterlace,
imagecolortransparent,
imageistruecolor, imagesx, imagesy, imagecolorstotal.I believe PHP should only offer required building blocks and leave to the
libraries the implementation of more structured functionality. My
proposal
is a completely different function called imagematch() that solves the
exact problem of the pixel-by-pixel matching, with the added value of
optionally matching portions of the images, using the following
signature:function imagematch(GdImage $image1, GdImage $image2, int $x1 = 0, int
$y1
= 0, int $x2 = 0, int $y2 = 0, ?int $width = null, ?int $height = null):
bool {}I already drafted PR #14914 [3] with the first two parameters. There is
not
an equivalent libgd implementation, so the implementation is entirely on
the php side, but as you can see it's quite trivial. Even with the
extended
functionality of matching portions of the images it would be just a few
extra lines of code, so not a big maintenance burden.If an RFC is required, I'm happy to redact it after the initial feedback
round.https://github.com/php/php-src/blob/5586d0c7de00acbfb217e1c6321da918b96ef960/ext/gd/libgd/gd.c#L2834
Difficult. It seems to me we first need to consider the use-case.
Likely useful only for testing purposes, but I may miss some other
use-cases.And now comes the really hard part: when do we consider images to be
equal? You have mentioned that gdImageCompare() completely ignores the
alpha channel, and that would be a mistake. I tend to agree, but if the
images are stored via imagejpeg(), for instance, the alpha channel would
be ignored as well, so the resulting images would be equal.
For the "test-oriented" use case I agree, but it doesn't only apply to
php's internal testsuite: i maintain quite a few projects which do image
manipulation whose testsuites would benefit from this, and that already
places us in userspace use case.
For the "what to consider different" i would apply the KISS principle and
just have a pixel-by-pixel yes/no difference, not even considering
palette/truecolor (as per my sample PR implementation). If you need to do
further checks (i.e. the mentioned palette/truecolor) you can still check
in userspace for o(1).
The idea being, can we provide the most expensive operation, that is basic
pixel by pixel comparison, as native function? Otherwise we keep discussing
bikeshedding and years go by without this being available. This does not
stop future more advanced comparison algorithms from being introduced (i.e.
imagediff()).
Then there is the question whether two fully transparent pixels (alpha
value 127) are different if their RGB channels have different values.
In theory, sure, but it practice is may not really matter.
Fully transparent black should be considered matching fully transparent
white, cause there can be no possible discerning difference in the
rendering.
Then we have the transparent pixel. Should equality comparison take
that into account? Again, in theory, it is not the same as a pixel with
alpha channel 127, but in practice it may very well be.
Yes. Overlaying a black 100% opaque pixel is different from overlaying 50%
black pixel, on a non-black surface. Again result is "rendering oriented".
Now you claim that doing pixel-to-pixel comparisons in PHP would be
really slow. I'd argue that with modern PHP the actual looping over
the pixels isn't that slow. Even getting the pixel colors
(imagecolorat()) and checking them for equality shouldn't be that slow.
I'm happy to run a quick bechmark and provider results, but my feeling is
we are talking orders of magnitude (without jit, i personally don't
consider it production ready). But I will also benchmarm it with JIT
enabled.
It only gets slow if you want to do more elaborate comparisons; e.g.
getting the actual values a palette entry refers to, but if you move
that out of the loop, performance might not be that bad. Implementing
that in C is certainly faster, but the difference should be measured,
and I wonder how much could be gained using SIMD instructions (something
libgd still ignores completely).
I'm talking about a naive pixel by pixel comparison, in plain C, without
any special cpu optimization. Using simd would be nice but out of scope
(that should definitely be a libgd discussion).
I will gladly follow up with numbers in a few days.
The idea being, can we provide the most expensive operation, that is basic
pixel by pixel comparison, as native function? Otherwise we keep discussing
bikeshedding and years go by without this being available. This does not
stop future more advanced comparison algorithms from being introduced (i.e.
imagediff()).
Okay, I basically agree. And given that this need is not urgent for
libgd per se (C clients can write their own functions without having a
performance impact), it is in my opionion fine to have some image
comparision in our extension binding, not necessarily using a function
provided by libgd.
Then we have the transparent pixel. Should equality comparison take
that into account? Again, in theory, it is not the same as a pixel with
alpha channel 127, but in practice it may very well be.Yes. Overlaying a black 100% opaque pixel is different from overlaying 50%
black pixel, on a non-black surface. Again result is "rendering oriented".
Oh, there is some misunderstanding here. In libgd, an alpha channel
value of 127 is maximum transparency (same as opacity:0 for CSS).
However, I'm talking about the special concept of the transparent pixel
(see imagecolortransparent()). This can be an arbitrary pixel value
(usually only used for palette images, but also at least partially
supported for truecolor images). That pixel value does not even need to
have an alpha channel greater than zero (note again, libgd alpha zero is
like CSS opacity:1), nor would its RGB values matter. This transparent
pixel is typically used with GIF, but also supported for PNG, and maybe
some other image formats.
I'm happy to run a quick bechmark and provider results, but my feeling is
we are talking orders of magnitude (without jit, i personally don't
consider it production ready). But I will also benchmarm it with JIT
enabled.
Yes, please do some performance comparisons. I didn't notice much
difference when working on PR #17305; at least not an order of magnitude.
Christoph