ImageMagick - alpha channel extract, different results (darker) on 6.7 vs 6.9 - image-processing

I'm working in a cross-platform environment where the results of the same, simple alpha channel extraction operation are different between ImageMagick 6.7.7-10 and ImageMagick 6.9.3-7.
The command to extract the alpha channel is:
convert image.png -alpha extract alpha.png
Or equivalently: convert image.png -channel A -separate alpha.png
Here is the original, the 6.7 output, and the 6.9 output:
Testing the original in Gimp, in the middle of the top dark bar, I can see that the original alpha value was 80% or 204:
The 6.9 output has grayscale value of 204, while the 6.7 output has a grayscale value of 154.
Now the question: I believe the 6.9 version is correct, but we like the visual result provided by the 6.7 version. Can we understand how the 6.7 version was working (maybe some different formula / luminance / color space?) and get the same result from the 6.9 version? Maybe apply some curve to the 6.9 output? Or some switch to make it use a different formula / color space? (Do color spaces even apply to PNGs?)

Post processing the 6.9 output with this simple gamma adjustment gives a very close approximation of the 6.7 output:
convert /tmp/alpha_channel_6.9.png -gamma 0.4472 /tmp/alpha_gamma_0.4472.png
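For a quick sanity check of that magic number: ImageMagick's -gamma g raises each normalized value to the power 1/g, and 1/0.4472 is roughly 2.24, close to the 2.22 used further down to go the other way. A few lines of Python (using the 204 and 154 values measured above) confirm the mapping:

# ImageMagick's -gamma g maps a normalized value u to u ** (1 / g).
v_69 = 204 / 255.0                    # alpha value measured in the 6.9 output
v_67 = 255 * v_69 ** (1 / 0.4472)     # what -gamma 0.4472 turns it into
print(round(v_67))                    # ~155, very close to the 154 seen in the 6.7 output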
Here's a gist of our solution in a shell script to detect 6.7 and apply the gamma adjustment selectively.
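The gist itself isn't reproduced here, but the idea is simple enough to sketch. A rough Python equivalent (not the actual shell script from the gist; the version parsing and file names are only illustrative) would be:

import re
import subprocess

# Hypothetical sketch: extract the alpha channel, and darken it with -gamma 0.4472
# only when the local ImageMagick is not a 6.7 build, so newer versions reproduce
# the darker 6.7-style output.
def extract_alpha(src, dst):
    banner = subprocess.run(["convert", "-version"], capture_output=True, text=True).stdout
    is_67 = re.search(r"ImageMagick 6\.7\.", banner) is not None
    cmd = ["convert", src, "-alpha", "extract"]
    if not is_67:
        cmd += ["-gamma", "0.4472"]
    cmd.append(dst)
    subprocess.run(cmd, check=True)

extract_alpha("image.png", "alpha.png")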
Note: Compared to the -fx answer, the gamma adjustment runs faster and is more accurate, judging by the lower statistical error (MAE of 190 vs 408) found with:
compare -verbose -metric MAE /tmp/alpha_channel_6.7.png /tmp/alpha_curved.png null: 2>&1
compare -verbose -metric MAE /tmp/alpha_channel_6.7.png /tmp/alpha_gamma_0.4472.png null: 2>&1
But I'm going to leave the -fx answer in place, because I like the process of finding curves it describes.
Incidentally, this command lightens 6.7 output to look like 6.9 output:
convert /tmp/alpha_channel_6.7.png -gamma 2.22 /tmp/alpha_gamma_to_look_like_6.9.png
But with such a big gamma boost, the results are pretty ugly with color banding in the dark areas:

Deprecated: The -gamma answer is faster and provides better results, but I'll leave the below info, as it could be useful for other problems needing a "curves" solution.
Ok, I was able to post-process my 6.9 alpha channel output with a curves function so that it very closely matches the 6.7 alpha channel output.
However, if someone has a more concise switch, let me know!
Long story short, here's the post processing step:
It uses convert's -fx filter to apply curves to make 6.9 alpha channel output look like 6.7:
convert /tmp/alpha_channel_6.9.png -fx "-0.456633863737214*u^4 + 1.33965176221586*u^3 + -0.0837635742856742*u^2 + 0.199687083827961*u +0.00105015974839925" /tmp/alpha_curved_to_look_like_6.7.png
One could figure out the inverse function, to make 6.7 look like 6.9, given enough motivation.
Note to my future self, here's waaay too many details about how to derive this function:
Ok, so there's a page on ImageMagick's website about achieving a "curves" effect. The fun part is, it uses gnuplot to fit a polynomial to the curves function that you'd normally see in Gimp or Photoshop.
So I had the idea that I could create a test image (note, it's white with alpha, so not easy to see), run it through 6.7 alpha extract and 6.9 alpha extract, and visually compare them (on separate layers) in Gimp:
Then poke around in the curves tool on the 6.9 layer to make it look exactly like the 6.7 image:
Ok, so I found the curve I want. Now luckily, in Gimp, if you hover over the curve plot, it tells you the coordinates of the cursor, so I can find the coordinates of my curve to fit with gnuplot (as described in the link above; note that I had to convert the coordinates from 0-255 to 0.0-1.0).
Cut super gory details, see this screencap of the general idea.
Note that I updated the ImageMagick code to fit a 4th degree polynomial, as it gave a better fit than their 3rd degree for me:
( echo 'f(x) = a*x**4 + b*x**3 + c*x**2 + d*x + e'; echo 'fit f(x) "fx_control.txt" via a, b, c, d, e'; echo 'print a,"*x^4 + ",b,"*x^3 + ",c,"*x^2 + ",d,"*x +",e'; ) | gnuplot 2>&1 | tail -1 > fx_funct.txt
This output:
-0.456633863737214*x^4 + 1.33965176221586*x^3 + -0.0837635742856742*x^2 + 0.199687083827961*x +0.00105015974839925
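As a spot check (not part of the original derivation), feeding the 0.8 alpha value from the first example through this fitted polynomial lands very close to the 6.7 result:

# Evaluate the gnuplot-fitted curve at u = 204/255 = 0.8.
coeffs = [-0.456633863737214, 1.33965176221586, -0.0837635742856742,
          0.199687083827961, 0.00105015974839925]   # coefficients for x^4 down to x^0
u = 204 / 255.0
y = sum(c * u ** p for c, p in zip(coeffs, (4, 3, 2, 1, 0)))
print(255 * y)                                      # ~154.5, close to the 154 produced by 6.7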
Ok, I used the above to generate a function of X, and plotted it using desmos.com, then screencapped that plot (in red) to overlay and compare it to the gimp curves. Looks pretty close to me:
So finally, switch the x's to u's, and plug it into ImageMagick, and voila, my 6.9 output looks like my 6.7 output once again:
convert /tmp/alpha_channel_6.9.png -fx "-0.456633863737214*u^4 + 1.33965176221586*u^3 + -0.0837635742856742*u^2 + 0.199687083827961*u +0.00105015974839925" /tmp/alpha_curved_to_look_like_6.7.png

Related

Research Paper Implementation (Image Processing)

I'm trying to implement this paper on my own, but there are some parts I don't fully understand.
UIumbra has three channels, since it's the result of a multiplication between I and n, where I is the original (color) image.
Q. Step 4 requires a binarization image B1 from UIumbra. It uses an integral image technique for binarization, which is equivalent to OpenCV's adaptiveThreshold. Unfortunately, adaptiveThreshold() takes a grayscale image. Is there any method to convert UIumbra to grayscale, or does cv2.cvtColor(UI, cv2.COLOR_BGR2GRAY) suffice?
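(Not an answer to the paper's ambiguity, just for reference: the cv2.cvtColor + adaptiveThreshold route mentioned above would look roughly like the sketch below; the file name, block size and constant are arbitrary placeholders.)

import cv2

UIumbra = cv2.imread("uiumbra.png")                     # placeholder 3-channel intermediate image
gray = cv2.cvtColor(UIumbra, cv2.COLOR_BGR2GRAY)        # plain BGR-to-gray conversion
B1 = cv2.adaptiveThreshold(gray, 255,
                           cv2.ADAPTIVE_THRESH_MEAN_C,  # local-mean threshold, implementable with integral images
                           cv2.THRESH_BINARY,
                           blockSize=31, C=10)          # arbitrary example parameters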
Q. LBWF is a binary version of LWF. LWF takes and returns a grayscale image. How do you make a binary version? (e.g. binarize the input?)
The paper doesn't explain those details, so I'm having trouble.
(I did send an email to the author and am waiting for an answer. Meanwhile, I want to hear your thoughts.)
Any help or idea is appreciated.

different results when opening an image into a numpy array using cv2.imread and PIL.Image.open

I am trying to open an image and turn it into a numpy array.
I have tried:
1) cv2.imread, which gives you a numpy array directly
2) PIL.Image.open followed by numpy.asarray to convert the image object.
Then I realised the resulting arrays from the same picture are different; please see the attached screenshots.
cv2.imread
PIL.Image.open
I would expect the color channels to always be in the same order, no matter the package, but I cannot seem to find any documentation for Pillow regarding this.
Or am I just being silly? Thanks in advance for any suggestion!!!
I don't know anything about PIL but, contrary to just about every other system in the world, OpenCV stores images in BGR order, not RGB. That catches every OpenCV beginner by surprise and it looks like that's the case with your example.
OpenCV
image = cv2.imread(image_path, 1)
image_cv = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
Pillow
image_from_pil = Image.open(image_path).convert("RGB")
image_pillow = numpy.array(image_from_pil)
image_pillow equals image_cv
Notes: While reading a JPEG image, image_pillow and image_cv may be slightly different because the libjpeg version used by OpenCV and Pillow may differ.
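A minimal way to verify that claim (assuming a lossless PNG input so libjpeg differences don't come into play; the file name is just an example):

import cv2
import numpy as np
from PIL import Image

image_path = "test.png"                                  # any lossless image

image_cv = cv2.cvtColor(cv2.imread(image_path, 1), cv2.COLOR_BGR2RGB)
image_pillow = np.array(Image.open(image_path).convert("RGB"))

print(np.array_equal(image_cv, image_pillow))            # expect True for PNG; JPEG may differ slightly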
As SSteve and Xin Yang correctly say, the main problem could be that cv2 returns the spatial domain (pixels) in BGR channel order instead of the usual RGB. You need to convert the output (reverse the order along the channel axis or use cv2.cvtColor).
Even after the channel-order conversion, the output might not be the same. Both PIL and cv2 use libjpeg under the hood, but the outputs of libjpeg differ between versions. Read this research paper for reference. Based on my experiments I can say that the libjpeg version used by PIL is unpredictable (it differs even on two identical MacBook Pro 2020 M1 machines using brew and the same Python and PIL versions).
If it does matter and you want to have control over which libjpeg/libjpeg-turbo/mozjpeg version is used for compression and decompression, use jpeglib. It is still in beta, but the production release is coming.

C++ TIFF (raw) to JPEG : Faster than ImageMagick?

I need to convert many TIFF images to JPEG per second. Currently I'm using libmagick++ (Q16). I'm in the process of compiling ImageMagick Q8, as I read that it may improve performance (especially because I'm only working with 8-bit images).
CImg also looks like a good option, and GraphicsMagick claims to be faster than ImageMagick. I haven't tested either of those yet, but I was wondering if there are any other alternatives that could be faster than ImageMagick Q8?
I'm looking for a Linux only solution.
UPDATE with GraphicsMagick & ImageMagick Q8
Base comparison (see comment to Mark): 0.2 secs with ImageMagick Q16
I successfully compiled GraphicsMagick with Q8, but in the end it seems about 30% slower than ImageMagick (0.3 secs).
After compiling ImageMagick with Q8, there was a gain of about 25% (0.15 secs). Nice :)
UPDATE with VIPS
Thanks to Mark's post, I gave VIPS a try, using the 7.38 version found in the Ubuntu Trusty repositories:
time vips copy input.tiff output.jpg[Q=95]
real 0m0.105s
user 0m0.130s
sys 0m0.038s
Very nice :)
I also tried with 7.42 (from ppa:dhor/myway), but it seems slightly slower:
real 0m0.134s
user 0m0.168s
sys 0m0.039s
I will try to compile VIPS from source and see if I can beat that time. Well done Mark!
UPDATE: with VIPS 8.0
Compiled from source, vips-8.0 gets practically the same performance as 7.38:
real 0m0.100s
user 0m0.137s
sys 0m0.031s
Configure command:
./configure CC=c99 CFLAGS=-O2 --without-magick --without-OpenEXR --without-openslide --without-matio --without-cfitsio --without-libwebp --without-pangoft2 --without-zip --without-png --without-python
I have a few thoughts...
Thought 1
If your input images are 15MB and, for argument's sake, your output images are 1MB, you are already using 80MB/s of disk bandwidth to process 5 images a second - which is already around 50% of what a sensible disk might sustain. I would do a little experiment with using a RAMdisk to see if that might help, or an SSD if you have one.
Thought 2
Try experimenting with using VIPS from the command line to convert your images. I benchmarked it like this:
# Create dummy input image with ImageMagick
convert -size 3288x1152! xc:gray +noise gaussian -depth 8 input.tif
# Check it out
ls -lrt
-rw-r--r--@ 1 mark staff 11372808 28 May 11:36 input.tif
identify input.tif
input.tif TIFF 3288x1152 3288x1152+0+0 8-bit sRGB 11.37MB 0.000u 0:00.000
# Convert to JPEG with ImageMagick
time convert input.tif output.jpg
real 0m0.409s
user 0m0.330s
sys 0m0.046s
# Convert to JPEG with VIPS
time vips copy input.tif output.jpg
real 0m0.218s
user 0m0.169s
sys 0m0.036s
Mmm, seems a good bit faster. YMMV of course.
Thought 3
Depending on the result of your test on disk speed, if your disk is not the limiting factor, consider using GNU Parallel to process more than one image at a time if you have a quad core CPU. It is pretty simple to use and I have always had excellent results with it.
For example, here I sequentially process 32 TIFF images created as above:
time for i in {0..31} ; do convert input-$i.tif output-$i.jpg; done
real 0m11.565s
user 0m10.571s
sys 0m0.862s
Now, I do exactly the same with GNU Parallel, doing 16 in parallel at a time
time parallel -j16 convert {} {.}.jpg ::: *tif
real 0m2.458s
user 0m15.773s
sys 0m1.734s
So, that's now 13 images per second, rather than 2.7 per second.

How does ImageMagick's '-subimage-search' operation work?

I have used ImageMagick in my application to compare images, using the compare command with the -subimage-search option.
But there is very little documentation about how -subimage-search works.
Can anyone provide more information on how it works? For example:
Does it compare using a color model, or does it use image segmentation to achieve its task?
What I know right now is it searches for the second image in the first.
But how this is done? Please explain.
Warning: Conducting a subimage-search is slow -- extremely slow even.
Theory
This slowness is due to how the subimage search is designed to work: it carries out a compare of the small image at every possible position within the larger image (against the area it currently covers at that location).
The basic command to use -subimage-search is this:
compare -subimage-search largeimage.ext subimage.ext resultimage.ext
As a result of this command you should get not one, but two images:
resultimage-0.ext : this image should display the (best) matching location.
resultimage-1.ext : this should be a "heatmap" of potential top-left corner locations.
The second image (map of locations) displays how well the sub-image matches at the respective position: the brighter the pixel, the better the match.
The "map" image has smaller dimensions, because it contains only locations or each potential top-left corner of the sub-image while fitting completely into the larger one. Its dimensions are:
width = width_of_largeimage - width_of_subimage + 1
height = height_of_largeimage - height_of_subimage + 1
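For example, searching for the 70x46 rose: image inside the 110x56 framed copy built in the practical example below yields a 41x11 map.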
The searching itself is conducted on the basis of differences of color vectors. Therefore it should result in fairly accurate color comparisons.
In order to improve the efficiency and speed of searching, you could follow this strategic plan:
First, compare a very, very small sub-image of the sub-image with the larger image. This should find possible locations faster.
Then use the results from step 1 to conduct a difference compare at each previously discovered potential location for more accurate matches.
Practical Example
Let's create two different images first:
convert rose: subimage.jpg
convert rose: -mattecolor blue -frame 20x5 largeimage.png
The first image, subimage.jpg (on the left), being a JPEG, will have some lossiness in the color encoding, so the sub-image cannot possibly create an exact match.
The main difference of the second image, largeimage.png (on the right), is the blue frame around the main part:
Now time the compare-command:
time compare -subimage-search largeimage.png subimage.jpg resultimage.png
@ 40,5
real 0m17.092s
user 0m17.015s
sys 0m0.027s
Here are the results:
resultimage-0.png (displaying best matching location) on the left;
resultimage-1.png (displaying the "heatmap" of potential matches) on the right.
Conclusion: Incorrect result? Bug?
Looking at the resulting images, and knowing how the two images were constructed, it seems to me that the result is not correct:
The command should have returned @ 20,5 instead of @ 40,5.
The resultimage-0.png should have the red area moved to the left by 20 pixels.
The heatmap, resultimage-1.png, seems to indicate the best matching location as the darkest pixel; maybe I was wrong about my above "the brighter the pixel the better the match" statement, and it should be "the darker the pixel..."?
I'll submit a bug report to the ImageMagick developers and see what they have to say about it....
Update
As suggested by @dlemstra, an ImageMagick developer, I tested adding a -metric operation to the subimage-search. This operation returns a numerical value indicating the closeness of a match. There are various metrics available, which can be listed with
convert -list metric
This returns the following list on my notebook (running ImageMagick v6.9.0-0 Q16 x86_64):
AE Fuzz MAE MEPP MSE NCC PAE PHASH PSNR RMSE
The meanings of these abbreviations are:
AE : absolute error count, number of different pixels (-fuzz effected)
Fuzz : mean color distance
MAE : mean absolute error (normalized), average channel error distance
MEPP : mean error per pixel (normalized mean error, normalized peak error)
MSE : mean error squared, average of the channel error squared
NCC : normalized cross correlation
PAE : peak absolute (normalized peak absolute)
PHASH : perceptual hash
PSNR : peak signal to noise ratio
RMSE : root mean squared (normalized root mean squared)
An interesting (and relatively recent) metric is phash ('perceptual hash'). It is the only one that does not require identical dimensions when comparing images directly (without the -subimage-search option). It is normally the best 'metric' to narrow down similar-looking images (or at least to reliably exclude those image pairs which look very different) without really "looking at them", on the command line and programmatically.
I did run the subimage-search with all these metrics, using a loop like this:
for m in $(convert -list metric); do
echo "METRIC $m";
compare -metric "$m" \
-subimage-search \
largeimage.png \
subimage.jpg \
resultimage---metric-${m}.png;
echo;
done
This was the command output:
METRIC AE
compare: images too dissimilar `largeimage.png' @ error/compare.c/CompareImageCommand/976.
METRIC Fuzz
1769.16 (0.0269957) @ 20,5
METRIC MAE
1271.96 (0.0194089) @ 20,5
METRIC MEPP
compare: images too dissimilar `largeimage.png' @ error/compare.c/CompareImageCommand/976.
METRIC MSE
47.7599 (0.000728769) @ 20,5
METRIC NCC
0.132653 @ 40,5
METRIC PAE
12850 (0.196078) @ 20,5
METRIC PHASH
compare: images too dissimilar `largeimage.png' @ error/compare.c/CompareImageCommand/976.
METRIC PSNR
compare: images too dissimilar `largeimage.png' @ error/compare.c/CompareImageCommand/976.
METRIC RMSE
1769.16 (0.0269957) @ 20,5
So the following metric settings did not work at all with -subimage-search, as also indicated by the "images too dissimilar" message:
PSNR, PHASH, MEPP, AE
(I'm actually a bit surprised that the failed metrics include the PHASH one here. This may require further investigation...)
The following resultimages looked largely correct:
resultimage---metric-RMSE.png
resultimage---metric-FUZZ.png
resultimage---metric-MAE.png
resultimage---metric-MSE.png
resultimage---metric-PAE.png
The following resultimages look similarly incorrect as my first run above, where no -metric result was asked for:
resultimage---metric-NCC.png (also returning the same incorrect coordinates of @ 40,5)
Here are the two resulting images for -metric RMSE (what Dirk Lemstra had suggested to use):

Detecting blobs that connect to any other blob, maybe with OpenCV

In the image, blob I connects (has a bridge, binds) to the universe, but II and III do not.
I need to detect both II and III, and also I if possible.
Is it possible with current computer vision libraries?
Or is there any approach or idea I can use to design my own algorithm?
Thanks.
It is possible, but it is hard to suggest a generic pre-processing solution without a good set of sample images.
One solution could be:
frame -> morphological closing + skeletonize + find contours (gives all)
frame -> skeletonize + find contours (gives 2 and 3)
the difference gives 1, obviously,
and maybe with some shape matching of those contours against a hand-drawn eye-like contour added, just for an extra check (see the sketch below).
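A rough OpenCV/scikit-image sketch of those two passes (assuming a clean binary mask as input; the file name, kernel size and threshold are made up for illustration, and the OpenCV 4.x findContours signature is used):

import cv2
import numpy as np
from skimage.morphology import skeletonize

# Hypothetical binary input: white blobs and bridges on a black background.
mask = cv2.imread("blobs.png", cv2.IMREAD_GRAYSCALE)
_, mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)

def skeleton_contours(binary):
    # Skeletonize a binary mask and return the external contours of the result.
    skel = (skeletonize(binary > 0) * 255).astype(np.uint8)
    contours, _ = cv2.findContours(skel, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return contours

# Pass 1: morphological closing, then skeletonize and find contours
# (the answer above suggests this yields all of I, II and III).
closed = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((15, 15), np.uint8))
all_blobs = skeleton_contours(closed)

# Pass 2: skeletonize the raw mask and find contours
# (suggested to yield only II and III, the blobs without a bridge).
free_blobs = skeleton_contours(mask)

# The difference between the two contour sets would then point at blob I.
print(len(all_blobs), len(free_blobs))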
