Is the 90 degree image rotation with graphicsmagick or imagemagick always lossless?
E.g. when doing
gm convert -rotate 90 img.img rot90.img.img
gm convert -rotate -90 rot90.img.img back.img
will img.img and back.img be equal?
The answer to this depends more on the particular image format you're using, rather than the internals of Image/GraphicsMagick (assuming they're competently written).
With a raw format (e.g. BMP), there should be no reason for this not to be completely identical.
With a lossless format, it's possible there may be some subtle variations due to numerical precision.
With a lossy format (e.g. JPEG), it's almost certain there will be differences. In the case of JPEG, for example, each convert run decodes the image and re-encodes it, so the 8x8 blocks are quantized again on every pass. If the image dimensions are not a multiple of the block size, rotating the image also changes which pixels end up in which block.
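To illustrate the raw-format case: on an uncompressed pixel grid a 90 degree rotation only moves pixels around, so rotating back restores the exact values. Here is a minimal Python sketch, with a nested list standing in for the image. The helper names rotate_cw and rotate_ccw are made up for this example and are not GraphicsMagick functions.

```python
def rotate_cw(img):
    # Rotate a 2D grid 90 degrees clockwise: reverse the rows, then transpose.
    return [list(row) for row in zip(*img[::-1])]

def rotate_ccw(img):
    # Rotate a 2D grid 90 degrees counter-clockwise: transpose, then reverse the rows.
    return [list(row) for row in zip(*img)][::-1]

img = [[1, 2, 3],
       [4, 5, 6]]
rot90 = rotate_cw(img)
back = rotate_ccw(rot90)
print(back == img)
```

No arithmetic is done on the pixel values here, which is why nothing can be lost. Any loss comes from the file format's encoding step, not from the rotation.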
I can't understand why these two scripts produce different results, given that the second one is just the first one split into two commands.
First script:
convert lena_std.tif -compress None -resize 160x160 -compress None -resize 32x32 test1.bmp
Second script:
convert lena_std.tif -compress None -resize 160x160 test2.bmp
convert test2.bmp -compress None -resize 32x32 test3.bmp
I use the following command to check the difference between the results:
convert test1.bmp test3.bmp -metric AE -compare diff.bmp
I use Imagemagick on Ubuntu 22.04. My convert -version indicates: Version: ImageMagick 6.9.11-60 Q16 x86_64 2021-01-25.
Because when you scale you interpolate pixels.
Roughly, the code considers the pixel at (x,y) in the result, and computes where it comes from in the source. This is usually not an exact pixel: it is more like an area when you scale down, or part of a pixel when you scale up. So to make up the color of the pixel at (x,y) some math is applied: if you scale down, some averaging of the source area, and if you scale up, something that depends on how close the source position is to the edge of the pixel and how different the colors of the neighboring pixels are.
This math can be very simple (the color of the closest pixel), simple (some linear average), a bit more complex (bi-cubic interpolation) or plain magic (sinc/Lanczos), the more complex forms giving the better results.
So, in one case, you obtain the result directly from the source, and in the other you obtain the final result from an approximation of what the image would look like at the intermediate size.
Another way to see it is that each interpolation has a spatial frequency response (like a filter in acoustics), and in one case you apply a single filter and in the other one you compose two filters.
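A small Python sketch of this effect, using a 1D row of pixels and a plain area-averaging (box) resize. The function box_resize and the sizes 10, 3 and 2 are assumptions chosen for illustration. ImageMagick's default resize filter is more elaborate, but the same reasoning applies.

```python
def box_resize(src, n_out):
    # Area-averaging resize of a 1D signal: each output sample is the mean of
    # the source interval it covers, with partial pixels weighted by overlap.
    n_in = len(src)
    scale = n_in / n_out
    out = []
    for i in range(n_out):
        start, end = i * scale, (i + 1) * scale
        total = 0.0
        j = int(start)
        while j < n_in and j < end:
            overlap = min(end, j + 1) - max(start, j)
            total += src[j] * overlap
            j += 1
        out.append(total / scale)
    return out

ramp = list(range(10))                          # source row: 0, 1, ..., 9
direct = box_resize(ramp, 2)                    # 10 -> 2 in one step
two_step = box_resize(box_resize(ramp, 3), 2)   # 10 -> 3 -> 2
print(direct, two_step)
```

In the direct case every source pixel in the left half contributes equally to the first output pixel. In the two-step case the intermediate pixels straddle that boundary, so some source pixels are weighted differently and the results diverge.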
I used Apple Numbers (a Spreadsheet app with styling options) to create a UX flowchart of various user interfaces of an app.
Apple Numbers has a PDF export option.
The problem is that even though some border lines in the table have been set to "none" in the export you nevertheless get small visible hairlines, see this cutout:
I want to eliminate the hairlines by image processing
before creating a flyover video over the graphics.
My basic idea is:
Convert vector to bitmap with very high resolution (oversampling, e.g. to 600 or 1200 DPI)
Then downsample to the target resolution (e.g. 150 DPI) with an algorithm that eliminates the hairlines (they should disappear under the dominance of the neighboring pixels) while the overall result remains as crisp and sharp as possible.
So step 1, I already figured out, by these two possibilities:
a. Apple Preview has a PDF to PNG export option where you can specify the DPI.
b. ImageMagick convert -density 600 source.pdf export.png
But for step 2 there are so many possibilities:
-resample <DPI> or -filter <FilterName> -resize 25% or -scale 12.5% (when going from 1200 to 150 DPI)
Please tell me by which methods (resample, resize, scale) and which of the interpolation algorithms or filters I shall use to achieve my goal of eliminating the hairlines by dissolving them into their neighboring pixels, with the rest (normal 1px lines, rendered text and symbols, etc) remaining as crisp as possible.
ImageMagick PDF to PNG conversion with different DPI settings:
convert -density XXX flowchart.pdf flowchart-ImageMagick-XXX.png
flowchart-ImageMagick-150.png ; flowchart-ImageMagick-300.png ; flowchart-ImageMagick-600.png
Apple Preview PDF to PNG export with different DPI settings:
flowchart-ApplePreview-150.png ; flowchart-ApplePreview-300.png ; flowchart-ApplePreview-600.png
Different downscaling approaches
a) convert -median 3x3 -resize 50% flowchart-ApplePreview-300.png flowchart-150-from-ApplePreview-300-median-3x3.png thanks to the hint from #ChristophRackwitz
b) convert -filter Box -resize 25% flowchart-ImageMagick-600.png flowchart-150-from-ImageMagick-600-resize-box.png
Comparison
flowchart-ApplePreview-150.png
flowchart-150-from-ApplePreview-300-median-3x3.png
✅ Hairlines gone
❌ But font is not as crisp anymore, median destroyed that.
flowchart-150-from-ImageMagick-600-resize-box.png
🆗 Overall still quite crisp
🆗 Hairline only very very faint, even only faint when zoomed in
Both variants are good enough for my Ken Burns / dolly camera ride over them. Still, I wish there were an algorithm that keeps the crispness but still eliminates 1px lines in very high DPI bitmaps. But I guess such a jack of all trades exists only in my fantasy.
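The difference between the two variants can be sketched in Python on a single row of pixels. The helpers box_downscale and median3 are simplified stand-ins written for this example and are not ImageMagick's actual implementations. The row holds one dark hairline pixel on a white background.

```python
from statistics import median

def box_downscale(row, factor):
    # Average each group of `factor` neighboring pixels (box filter).
    return [sum(row[i:i + factor]) / factor for i in range(0, len(row), factor)]

def median3(row):
    # 1D median over a 3-pixel window, repeating the edge pixels at the borders.
    padded = [row[0]] + row + [row[-1]]
    return [median(padded[i:i + 3]) for i in range(len(row))]

hairline = [255] * 16
hairline[5] = 0                      # a single dark pixel: the hairline

thick = [255] * 16
thick[4:8] = [0, 0, 0, 0]            # a 4-pixel line that should survive

print(box_downscale(hairline, 4))    # hairline is diluted, not removed
print(median3(hairline))             # hairline is removed completely
print(box_downscale(thick, 4))       # thick line stays fully dark
```

Under these assumptions the box filter only dilutes the hairline to a quarter of its contrast, which matches the "very faint" result above. The median removes it entirely, but it also rounds off any other detail narrower than its window, such as thin font strokes.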
Processing Durations
MacBook Pro 15'' (Mid 2014, 2,5 GHz Quad-Core Intel Core i7)
ImageMagick PDF to PNG
PDF source Ca. 84x60cm (33x23'')
300dpi -> 27s
600dpi -> 1m58s
1200dpi -> 37m34s
ImageMagick Downscaling
time convert -filter Box -resize 25% 1#600.png 1#150-from-600.png
# PNG # 39700 × 28066: 135.57s user 396.99s system 109% cpu 8:08.08 total
time convert -median 3x3 -resize 50% 2#300.png 2#150-from-300-median3x3.png
# PNG # 19850 × 14033: 311.48s user 9.42s system 536% cpu 59.76 total
time convert -median 3x3 -resize 50% 3#300.png 3#150-from-300-median3x3.png
# PNG # 19850 × 14033: 237.13s user 8.33s system 544% cpu 45.05 total
I wonder which of the methods below preserves more image detail:
Down scaling BGRA images and then converting them to NV12/YV12.
Converting BGRA images to NV12/YV12 images and then down scaling them.
Thanks for your recommendation.
Updated 2020-02-04:
To make my question clearer, I want to describe it in a little more detail.
The images come from a video stream like this:
Video Stream
1. -> decoded to YV12.
2. -> converted to BGRA.
3. -> stamped texts.
4. -> scaling down (or YV12/NV12).
5. -> YV12/NV12 (or scaling down).
6. -> H264 encoder.
7. -> video stream.
The whole sequence of tasks takes 300 to 500 ms.
The issue I have is that text stamped over the images does not look clear after the images are converted
and scaled. I wonder about the order of items 4 and 5: 4 then 5, or 5 then 4?
Noting that the RGB data is very likely to be non-linear (e.g. in an sRGB format), ideally you need to:
Convert from the non-linear "R'G'B'" data to linear RGB (Note this needs higher bit precision per channel) (see function spec on wikipedia)
Apply your downscaling filter
Convert the linear result back to non-linear R'G'B' (ie. sRGB)
Convert this to YCbCr/NV12
Ideally you should always do filtering/blending/shading in linear space. To give you an intuitive justification for this, the average of black (0) and white (255) in linear colour space will be ~128, but in sRGB this mid grey is represented as roughly 188 (about 186 with a plain 2.2 gamma). If you instead do your maths in sRGB space, your result will look unnaturally dark/murky.
(If you are in a hurry, you can sometimes get away with just using squaring (and sqrt()) as a kludge/hack to convert from sRGB to linear (and vice versa))
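The black/white example can be checked with a short Python sketch. The two conversion functions use the standard sRGB piecewise transfer function. The function names are made up for this example, and values are handled as floats to sidestep the bit-precision concern mentioned above.

```python
def srgb_to_linear(c8):
    # Decode an 8-bit sRGB value (0..255) to linear light (0..1).
    c = c8 / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(lin):
    # Encode linear light (0..1) back to an sRGB value on the 0..255 scale.
    c = lin * 12.92 if lin <= 0.0031308 else 1.055 * lin ** (1 / 2.4) - 0.055
    return c * 255

black, white = 0, 255

# Naive average, computed directly on the sRGB-encoded values.
naive_mean = (black + white) / 2

# Average in linear light, then re-encode to sRGB.
linear_mean = linear_to_srgb((srgb_to_linear(black) + srgb_to_linear(white)) / 2)

print(naive_mean, linear_mean)
```

The naive result is 127.5, while averaging in linear light gives a value between 187 and 188. That gap is the darkening described above.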
For avoiding two phases of spatial interpolation the following order is recommended:
Convert RGBA to YUV444 (YCbCr) without resizing.
Resize Y channel to your destination resolution.
Resize U (Cb) and V (Cr) channels to half resolution in each axis.
The result format is YUV420 in the resolution of the output image.
Pack the data as NV12 (NV12 is YUV420 in specific data ordering).
It is possible to do the resize and NV12 packing in a single pass (if efficiency is a concern).
In case you don't do the conversion to YUV444, U and V channels are going to be interpolated twice:
First interpolation when downscaling RGBA.
Second interpolation when U and V are downscaled by half when converting to 420 format.
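A minimal Python sketch of the recommended order, under several assumptions that are not in the text above: full-range BT.601 conversion coefficients, a plain box filter as the resize, an integer scale factor, and image dimensions that divide evenly. The function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    # Assumed: full-range BT.601 coefficients (an illustration choice).
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 + 0.5 * (b - y) / (1 - 0.114)
    cr = 128 + 0.5 * (r - y) / (1 - 0.299)
    return y, cb, cr

def box_down(plane, f):
    # Average each f x f block; width and height must be multiples of f.
    h, w = len(plane), len(plane[0])
    return [[sum(plane[y + dy][x + dx] for dy in range(f) for dx in range(f)) / (f * f)
             for x in range(0, w, f)]
            for y in range(0, h, f)]

def to_byte(v):
    # Round and clamp to the 8-bit range.
    return max(0, min(255, round(v)))

def rgb_to_nv12(rgb, scale):
    # 1. Convert to YUV444 at full resolution (no resizing yet).
    ycc = [[rgb_to_ycbcr(*px) for px in row] for row in rgb]
    y_full = [[p[0] for p in row] for row in ycc]
    cb_full = [[p[1] for p in row] for row in ycc]
    cr_full = [[p[2] for p in row] for row in ycc]
    # 2. Resize Y to the destination size; resize Cb/Cr ONCE to half of that.
    y_out = box_down(y_full, scale)
    cb_out = box_down(cb_full, 2 * scale)
    cr_out = box_down(cr_full, 2 * scale)
    # 3. Pack as NV12: the Y plane followed by interleaved Cb/Cr samples.
    data = [to_byte(v) for row in y_out for v in row]
    for cb_row, cr_row in zip(cb_out, cr_out):
        for cb, cr in zip(cb_row, cr_row):
            data += [to_byte(cb), to_byte(cr)]
    return bytes(data)

grey = [[(128, 128, 128)] * 4 for _ in range(4)]   # 4x4 mid-grey test image
nv12 = rgb_to_nv12(grey, 2)                        # 2x2 output: 4 Y bytes + 2 chroma bytes
print(len(nv12), list(nv12))
```

The point of the design is in step 2: the chroma planes go from full resolution to their final size in a single filtering pass instead of being resampled twice.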
When downscaling the image it's recommended to blur the image before downscaling (sometimes referred to as an "anti-aliasing" filter).
Remark: since the eye is less sensitive to chromatic resolution, you are probably not going to see any visible difference (unless image has fine resolution graphics like colored text).
Remarks:
Simon's answer is more accurate in terms of color accuracy.
In most cases you are not going to see the difference.
The gamma information is lost when converting to NV12.
Update: Regarding "Text stamped over the images after converted and scaled looks not so clear":
In case getting clear text is the main issue, the following stages are suggested:
Downscale BGRA.
Stamp text (using smaller font).
Convert to NV12.
Downsampling an image with stamped text results in unclear text.
A better solution is to stamp the text with a smaller font after downscaling.
Modern fonts use vector graphics, not raster graphics, so stamping text with a smaller font gives a better result than downscaling an image with stamped text.
NV12 format is YUV420: the U and V channels are downscaled by a factor of 2 in each axis, so the text quality will be lower compared to RGB or YUV444 format.
Encoding an image that contains text also degrades the text.
For subtitles the solution is attaching the subtitles in a separate stream, and adding the text after decoding the video.
I have captured a burst of 5 DNG images from a Nexus 6P for scientific imaging. The pixel intensities from the image will be mapped to my measurement value. For further processing, the 5 DNG images are averaged to reduce the noise and converted to PNG. I am using the command below to achieve this:
convert dng:*.dng -average out.png
I would like to know whether any processing that changes the pixel intensity values is applied to the DNG images during conversion, as it would affect my final calibration.
Version: ImageMagick 7.0.3-4, Windows 10
I'm working on a project where we need to match original hi-resolution photos to their scaled down counterparts. For example the original may be 2000px x 2000px, and the scaled down version might be 500px x 500px.
In researching how to do this I've found mention that ImageMagick's compare operation can be used to compare larger and smaller images, but that it behaves as though the smaller image has been cropped from the larger--and as a result it performs a very intensive scan (http://www.imagemagick.org/discourse-server/viewtopic.php?f=2&t=16781#p61937).
Is there an option or flag that I can use to indicate that I only want a match if the smaller image has been scaled (not cropped) from the larger image?
You can temporarily scale the larger image down to the size of the smaller image and then compare the resized version to the thumbnails, as described by Marc Maurice on his blog.
convert bigimage.png -resize 500x500 MIFF:- | \
compare - -metric AE -fuzz '10%' smallimage.png null:
Because the resize algorithm is probably different from the original resize algorithm, this will introduce differences, but if the smaller images are only scaled and not changed otherwise, the similarities should be sufficient to do the matching. You'll have to find a suitable metric and threshold though.
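The idea can be sketched in Python on small greyscale grids: scale the large image down, then count the pixels that differ by more than a tolerance, which is roughly what the AE metric with -fuzz reports. This is a simplified illustration and not ImageMagick's exact computation. The helper names, the box filter and the sample values are all assumptions.

```python
def box_down(img, f):
    # Average each f x f block; width and height must be multiples of f.
    h, w = len(img), len(img[0])
    return [[sum(img[y + dy][x + dx] for dy in range(f) for dx in range(f)) / (f * f)
             for x in range(0, w, f)]
            for y in range(0, h, f)]

def absolute_error(a, b, fuzz=0.10, max_value=255):
    # Count pixels whose difference exceeds fuzz * max_value.
    tol = fuzz * max_value
    return sum(1 for row_a, row_b in zip(a, b)
                 for pa, pb in zip(row_a, row_b)
                 if abs(pa - pb) > tol)

big = [[x * 16 + y * 4 for x in range(4)] for y in range(4)]   # 4x4 gradient
thumb = [[12, 40], [20, 51]]          # scaled copy with small resize differences
other = [[200, 200], [200, 200]]      # unrelated image of the same size

scaled = box_down(big, 2)
print(absolute_error(scaled, thumb))  # few or no differing pixels: likely a match
print(absolute_error(scaled, other))  # many differing pixels: not a match
```

A match is then a count below some threshold. As noted above, both the tolerance and the threshold have to be tuned to the images at hand.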
If you don't know the thumbnail sizes or if they differ, you can either downsize both images to a safe size below the minimum of all thumbnail sizes, or get each thumbnail's size with
identify -format "%w,%h" smallimage.png