Can ImageMagick find a scaled down match of a large image? - imagemagick

I'm working on a project where we need to match original hi-resolution photos to their scaled down counterparts. For example the original may be 2000px x 2000px, and the scaled down version might be 500px x 500px.
In researching how to do this I've found mention that ImageMagick's compare operation can be used to compare larger and smaller images, but that it behaves as though the smaller image has been cropped from the larger--and as a result it performs a very intensive scan (http://www.imagemagick.org/discourse-server/viewtopic.php?f=2&t=16781#p61937).
Is there an option or flag that I can use to indicate that I only want a match if the smaller image has been scaled (not cropped) from the larger image?

You can temporarily scale the larger image down to the size of the smaller image and then compare the resized version to the thumbnails, as described by Marc Maurice on his blog.
convert bigimage.png -resize 500x500 MIFF:- | \
compare - -metric AE -fuzz '10%' smallimage.png null:
Because the resize algorithm is probably different from the original resize algorithm, this will introduce differences, but if the smaller images are only scaled and not changed otherwise, the similarities should be sufficient to do the matching. You'll have to find a suitable metric and threshold though.
If you don't now the thumbnail sizes or if they differ, you may want to downsize both images to a safe size below the minimum of all thumbnail sizes or you grab the thumbnail sizes with
identify -format "%w,%h" smallimage.png

Related

Imagemagick resized pictures are different using a single command or two commands

I can't understand why those two scripts seem to produce a different result, given that the second one is like the first one but separated into two commands.
First script:
convert lena_std.tif -compress None -resize 160x160 -compress None -resize 32x32 test1.bmp
Second script:
convert lena_std.tif -compress None -resize 160x160 test2.bmp
convert test2.bmp -compress None -resize 32x32 test3.bmp
I use the following command to check the difference between the results:
convert test1.bmp test3.bmp -metric AE -compare diff.bmp
I use Imagemagick on Ubuntu 22.04. My convert -version indicates: Version: ImageMagick 6.9.11-60 Q16 x86_64 2021-01-25.
Because when you scale you interpolate pixels.
Roughly, the code considers the pixel at (x,y) in the result, and computes where it comes from in the source. This is usually not an exact pixel, more like an area, when you scale down, or part of a pixel, when you scale up. So to make up the color of the pixel at (x,y) some math is applied: if you scale down, some averaging of the source area, and if you scale up, something that depends on how close the source is to the edge of the pixel and how different the color of neighboring pixels are.
This math can be very simple (the color of the closest pixel), simple (some linear average), a bit more complex (bi-cubic interpolation) or plain magic (sinc/Lanczos), the more complex forms giving the better results.
So, in one case, you obtain a result directly from the source to the pixel you want, and in the other you obtain the final result from an approximation of what the image would look at the intermediate size.
Another way to see it is that each interpolation has a spatial frequency response (like a filter in acoustics), and in one case you apply a single filter and in the other one you compose two filters.

Eliminate hairlines from a vector graphics by converting to oversampled bitmap and then downscaling - How with ImageMagick?

I used Apple Numbers (a Spreadsheet app with styling options) to create a UX flowchart of various user interfaces of an app.
Apple Numbers has a PDF export option.
The problem is that even though some border lines in the table have been set to "none" in the export you nevertheless get small visible hairlines, see this cutout:
[
I want to to eliminate the hairlines by image processing
Before creating a flyover video over the graphics.
My basic idea is:
Convert vector to bitmap with very high resolution (oversampling, e.g. to 600 or 1200 DPI)
Then downsample to the target resolution (e.g. 150 DPI) with an algorithm which eliminates the hairlines (disappearing in the dominance of neighboring pixels) while overally still remaining as crisp and sharp as possible.
So step 1, I already figured out, by these two possibilities:
a. Apple Preview has a PDF to PNG export option where you can specify the DPI.
b. ImageMagick convert -density 600 source.pdf export.png
But for step 2 there are so many possibilities:
resample <DPI> or -filter <FilterName> -resize 25% or -scale 12.5% (when from 1200 to 150)
Please tell me by which methods (resample, resize, scale) and which of the interpolation algorithms or filters I shall use to achieve my goal of eliminating the hairlines by dissolving them into their neighboring pixels, with the rest (normal 1px lines, rendered text and symbols, etc) remaining as crisp as possible.
ImageMagick PDF tp PNG conversion with different DPI settings:
convert -density XXX flowchart.pdf flowchart-ImageMagick-XXX.png
flowchart-ImageMagick-150.png ; flowchart-ImageMagick-300.png ; flowchart-ImageMagick-600.png
Apple Preview PDF to PNG export with different DPI settings:
flowchart-ApplePreview-150.png ; flowchart-ApplePreview-300.png ; flowchart-ApplePreview-600.png
Different downscaling processings
a) convert -median 3x3 -resize 50% flowchart-ApplePreview-300.png flowchart-150-from-ApplePreview-300-median-3x3.png thanks to the hint from #ChristophRackwitz
b) convert -filter Box -resize 25% flowchart-ImageMagick-600.png flowchart-150-from-ImageMagick-600-resize-box.png
Comparison
flowchart-ApplePreview-150.png
flowchart-150-from-ApplePreview-300-median-3x3.png
โœ… Hairlines gone
โŒ But font is not as crisp anymore, median destroyed that.
flowchart-150-from-ImageMagick-600-resize-box.png
๐Ÿ†— Overally still quite crisp
๐Ÿ†— Hairline only very very faint, even only faint when zoomed in
Both variants are somehow good enough for my KenBurns / Dolly cam ride over them. Still I wished that there'd be an algorithm that keeps cripness but still eliminates 1px lines in very high DPI bitmaps. But I guess this is a Jack of all trades only in my phantasy.
Processing Durations
MacBook Pro 15'' (Mid 2014, 2,5 GHz Quad-Core Intel Core i7)
ImageMagick PDF to PNG
PDF source Ca. 84x60cm (33x23'')
300dpi -> 27s
600dpi -> 1m58s
1200dpi -> 37m34s
ImageMagic Downscaling
time convert -filter Box -resize 25% 1#600.png 1#150-from-600.png
# PNG # 39700โ€Šร—โ€Š28066: 135.57s user 396.99s system 109% cpu 8:08.08 total
time convert -median 3x3 -resize 50% 2#300.png 2#150-from-300-median3x3.png
# PNG # 19850โ€Šร—โ€Š14033: 311.48s user 9.42s system 536% cpu 59.76 total
time convert -median 3x3 -resize 50% 3#300.png 3#150-from-300-median3x3.png
# PNG # 19850โ€Šร—โ€Š14033: 237.13s user 8.33s system 544% cpu 45.05 total

Scaling images before doing conversion or vice versa?

I wonder which one among methods below should preserve more details of images:
Down scaling BGRA images and then converting them to NV12/YV12.
Converting BGRA images to NV12/YV12 images and then down scaling them.
Thanks for your recommendation.
Updated 2020-02-04:
For my question is more clear, I want to desribe a little more.
The images is come from a video stream like this:
Video Stream
-> decoded to YV12.
-> converted to BGRA.
-> stamped texts.
-> scaling down (or YV12/NV12).
-> YV12/NV12 (or scaling down).
-> H264 encoder.
-> video stream.
The whole sequence of tasks ranges from 300 to 500ms.
The issue I have is text stamped over the images after converted
and scaled looks not so clear. I wonder order at items: 4. then .5 or .5 then.4
Noting that the RGB data is very likely to be non-linear (e.g. in an sRGB format) ideally you need to
Convert from the non-linear "R'G'B'" data to linear RGB (Note this needs higher bit precision per channel) (see function spec on wikipedia)
Apply your downscaling filter
Convert the linear result back to non-linear R'G'B' (ie. sRGB)
Convert this to YCbCr/NV12
Ideally you should always do filtering/blending/shading in linear space. To give you an intuitive justification for this, the average of black (0) and white (255) in linear colour space will be ~128 but in sRGB this mid grey is represented as (IIRC) 186. If you thus do your maths in sRGB space, your result will look unnaturally dark/murky.
(If you are in a hurry, you can sometimes get away with just using squaring (and sqrt()) as a kludge/hack to convert from sRGB to linear (and vice versa))
For avoiding two phases of spatial interpolation the following order is recommended:
Convert RGBA to YUV444 (YCbCr) without resizing.
Resize Y channel to your destination resolution.
Resize U (Cb) and V (Cr) channels to half resolution in each axis.
The result format is YUV420 in the resolution of the output image.
Pack the data as NV12 (NV12 is YUV420 in specific data ordering).
It is possible to do the resize and NV12 packing in a single pass (if efficiency is a concern).
In case you don't do the conversion to YUV444, U and V channels are going to be interpolated twice:
First interpolation when downscaling RGBA.
Second interpolation when U and V are downscaled by half when converting to 420 format.
When downscaling the image it's recommended to blur the image before downscaling (sometimes referred as "anti-aliasing" filter).
Remark: since the eye is less sensitive to chromatic resolution, you are probably not going to see any visible difference (unless image has fine resolution graphics like colored text).
Remarks:
Simon answer is more accurate in terms of color accuracy.
In most cases you are not going to see the difference.
The gamma information is lost when converting to NV12.
Update: Regarding "Text stamped over the images after converted and scaled looks not so clear":
In case getting clear text is the main issue, the following stages are suggested:
Downscale BGRA.
Stamp text (using smaller font).
Convert to NV12.
Downsampling an image with stamped text, is going to result unclear text.
A better solution is to stamp a test with smaller font, after downscaling.
Modern fonts uses vectored graphics, and not raster graphics, so stamping text with smaller font gives better result than downscaled image with stamped text.
NV12 format is YUV420, the U and V channels are downscaled by a factor of x2 in each axis, so the text quality will be lower compared to RGB or YUV444 format.
Encoding image with text is also going to damage the text.
For subtitles the solution is attaching the subtitles in a separate stream, and adding the text after decoding the video.

Compare scanned image (label) with original

There is an original high quality label. After it's been printed we scan a sample and want to compare it with original to find errors in printed text for example. Original and scanned images are almost of the same size (but a bit different).
ImageMagic can do it great but not with scanned image (I suppose it compares it bitwise but scanned image contains to much "noise").
Is there an utility that can so such a comparison? Or may be an algorithm (implemented or easy to implement) - like the one that uses Cauchyโ€“Schwarz inequality in signal processing?
Adding sample pics.
Original:-
Scanned:-
Further Thoughts
As I explained in the comments, I think the registration of the original and scanned images is going to be important as your scans are not exactly horizontal nor the same size. To do a crude registration, you could find some points of high-contrast that are hopefully unique in the original image. So, say I wanted one on the top-left (called tl.jpg), one in the top-right (tr.jpg), one in the bottom-left (bl.jpg) and one in the bottom-right (br.jpg). I might choose these:
[]
[]3
I can now find these in the original image and in the scanned image using a sub-image search, for example:
compare -metric RMSE -subimage-search original.jpg tl.jpg a.png b.png
1148.27 (0.0175214) # 168,103
That shows me where the sub-image has been found, and the second (greyish) image shows me a white peak where the image is actually located. It also tells me that the sub image is at coordinates [168,103] in the original image.
compare -metric RMSE -subimage-search scanned.jpg tl.jpg a.png b.png
7343.29 (0.112051) # 173,102
And now I know that same point is at coordinates [173,102] in the scanned image. So I need to transform [173,102] to [168,103].
I then need to do that for the other sub images:
compare -metric RMSE -subimage-search scanned.jpg br.jpg result.png
8058.29 (0.122962) # 577,592
Ok, so we can get 4 points, one near each corner in the original image, and their corresponding locations in the scanned image. Then we need to do an affine transformation - which I may, or may not do in the future. There are notes on how to do it here.
Original Answer
It would help if you were able to supply some sample images to show what sort of problems you are expecting with the labels. However, let's assume you have these:
label.png
unhappy.png
unhappy2.png
I have only put a red border around them so you can see the edges on this white background.
If you use Fred Weinhaus's script similar from his superb website, you can now compute a normalised cross correlation between the original image and the unhappy ones. So, taking the original label and the one with one track of white across it, they come out pretty similar (96%)
./similar label.png unhappy.png
Similarity Metric: 0.960718
If we now try the more unhappy one with two tracks across it, they are less similar (92%):
./similar label.png unhappy2.png
Similarity Metric: 0.921804
Ok, that seems to work. We now need to deal with the shifted and differently sized scan, so I will attempt to trim them to only get the important stuff and blur them to lose any noise and resize to a standardised size for comparison using a little script.
#!/bin/bash
image1=$1
image2=$2
fuzz="10%"
filtration="-median 5x5"
resize="-resize 500x300"
echo DEBUG: Preparing $image1 and $image2...
# Get cropbox from blurred image
cropbox=$(convert "$image1" -fuzz $fuzz $filtration -format %# info:)
# Now crop original unblurred image and resize to standard size
convert "$image1" -crop "$cropbox" $resize +repage im1.png
# Get cropbox from blurred image
cropbox=$(convert "$image2" -fuzz $fuzz $filtration -format %# info:)
# Now crop original unblurred image and resize to standard size
convert "$image2" -crop "$cropbox" $resize +repage im2.png
# Now compare using Fred's script
./similar im1.png im2.png
We can now compare the original label with a new image called unhappy-shifted.png
./prepare label.png unhappy-shifted.png
DEBUG: Preparing label.png and unhappy-shifted.png...
Similarity Metric: 1
And we can see they compare the same despite being shifted. Obviously I cannot see your images, how noisy they are, what sort of background you have, how big they are, what colour they are and so on - so you may need to adjust the preparation where I have just done a median filter. Maybe you need a blur and/or a threshold. Maybe you need to go to greyscale.

resize png image using rmagick without losing quality

I need to resize an 200*200 image to 60*60 in rmagick without losing image quality. Currently I am doing following for a png image
img = Magick::Image.from_blob(params[:file].read)[0]
img.write(RootPath + params[:dir_str] + "/#{filename}") do
self.quality=100;
# self.compression = Magick::ZipCompression
end
I am losing sharpness in the resulting image. I want to be able to resize by losing the least amount of image quality.
I tried to set it's quality and different compressions, but all of them seems not works fine.
all resulting image are still looks being removed a layer of colors and the word character are losing sharpness
anyone could give me some instructions for resizing png images?
You're resizing a picture from 200x200 = 40,000 down to 60x60 = 3,600 - that is, less than a tenth of the resolution - and you're surprised that you lose image quality? Think of it this way - could you take a 16x16 image and resize it to 5x5 with no loss of quality? That is about the same as you are trying to do here.
If what you are saying you want to do was actually possible, then every picture could be reduced to one pixel with no loss of quality.
With the art designer's 60x60 image being better quality than yours, it depends on the original size of the image that the art designer is working from. For example, if the art designer was working from an 800x800 image and provided your 200x200 image from that, then also reduced the original 800x800 image to 60x60 in PS then that 60x60 image will be better quailty than the one you have. This is because your 60x60 image has gone throuogh two losses of quality: one to go to 200x200 and the second to go from 200x200 to 60x60. Necessarily this will be worse than the image resized from the original.
You could convert the png to a vector image, resize the vector to 60x60 then convert the vector to png. Almost lossless.

Resources