Why does ImageMagick's morphology dilation algorithm differ from the mathematical definition?

Origin Image
0 0 0
0 1 0
0 0 0
generated by:
$ convert -size 3x3 xc:black -fill white -draw 'point 1,1' origin.png
Dilation Process
Use a 2x1 rectangle as the kernel with central point (0,0):
processed by:
$ convert origin.png -morphology Dilate Rectangle:2x1+0+0 output.png
Expected Output
0 0 0
1 1 0
0 0 0
Actual Output
0 0 0
0 1 1
0 0 0
Question
Why is the output not what I expected? I wonder how ImageMagick processes dilation.
Here is my understanding:
When the kernel's central point reaches position (0,1) of the original image, I thought (0,1) should have become 1 after the AND operations.

The "center point" of a 2x1 kernel is between pixels. So you have to choose which one is the official "origin". It is arbitrary. But you can set the origin in ImageMagick when defining the kernel. See https://imagemagick.org/Usage/morphology/#user
For example, for a 2x1 kernel it could be either
2x1+0+0: origin on the left pixel (offset 0,0)
or
2x1+1+0: origin on the right pixel (offset 1,0)
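A quick way to see the effect of each choice on the question's 3x3 test image is to dump the dilated pixels as text with the txt: coder instead of writing a PNG (a small sketch, reusing origin.png from above):
# Origin on the left pixel of the kernel
convert origin.png -morphology Dilate Rectangle:2x1+0+0 txt:-
# Origin on the right pixel of the kernel
convert origin.png -morphology Dilate Rectangle:2x1+1+0 txt:-
Comparing the two listings shows which neighbour of the centre pixel gets switched on for each choice of origin.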

Related

What is the difference between colormap and histogram in the identify command output?

I use the identify command in the form below:
identify -verbose image.png
Part of the output is:
Colors: 8
Histogram:
49602: ( 49, 51, 39) #313327 srgb(49,51,39)
36492: ( 98,121,135) #627987 srgb(98,121,135)
21728: ( 98,182,240) #62B6F0 srgb(98,182,240)
39526: (121,131, 75) #79834B srgb(121,131,75)
34298: (165,171,147) #A5AB93 srgb(165,171,147)
29957: (185,200,226) #B9C8E2 srgb(185,200,226)
18767: (210,185, 67) #D2B943 srgb(210,185,67)
31774: (246, 69, 44) #F6452C srgb(246,69,44)
Colormap entries: 9
Colormap:
0: (121,131, 75) #79834B srgb(121,131,75)
1: ( 49, 51, 39) #313327 srgb(49,51,39)
2: (210,185, 67) #D2B943 srgb(210,185,67)
3: (165,171,147) #A5AB93 srgb(165,171,147)
4: (185,200,226) #B9C8E2 srgb(185,200,226)
5: ( 98,121,135) #627987 srgb(98,121,135)
6: ( 98,182,240) #62B6F0 srgb(98,182,240)
7: (246, 69, 44) #F6452C srgb(246,69,44)
8: (255,255,255) #FFFFFF white
I see that the same colours as in the Histogram, plus white, also appear in the Colormap, but in a different order.
What is the difference between the two?
The first line under Histogram:
49602: ( 49, 51, 39) #313327 srgb(49,51,39)
tells you that there are 49,602 pixels in the image with the colour srgb(49,51,39). So it tells you the frequency of occurrence, i.e. how often each colour appears in the image.
The 9 lines under Colormap: are the palette of the image.
Let's look at the first line:
0: (121,131, 75) #79834B srgb(121,131,75)
That means that wherever the colour srgb(121,131,75) occurs in the image, only the palette index 0 is stored at that location, rather than the colour 121,131,75 itself. So each pixel needs just 1 byte for the index instead of 3 bytes of RGB, saving roughly 2/3 of the space. The colormap is a "LookUp Table", or palette.
Palettes trade space for colour accuracy. In general, they are 1/3 of the size of the original image, but can normally only store 256 unique colours rather than the 16,777,216 colours of a conventional RGB image.
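If you only want the colour counts and not the whole verbose report, the histogram can be printed on its own via the histogram: coder (a small sketch, assuming the same image.png as above):
# Print one line per colour: pixel count, RGB triplet and hex value
convert image.png -format %c -depth 8 histogram:info:-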
Just for fun, let's create this smooth greyscale gradient and some random noise as a conventional RGB888 image (which comes out at 75kB):
magick -size 40x600 gradient: \( xc: +noise random \) +append -rotate 90 PNG24:a.png
And now do the same thing, but oblige ImageMagick to create a palette image (which comes out at 25kB):
magick -size 40x600 gradient: \( xc: +noise random \) +append -rotate 90 PNG8:a.png
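You can confirm which kind of file you got with identify: the %r escape reports the image class (DirectClass for plain RGB pixels, PseudoClass for a palette image) and %k reports the number of unique colours (a small sketch):
# PseudoClass means the PNG was written with a palette/colormap
identify -format "%r, %k unique colours\n" a.png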
There is a longer explanation with example here.

Does the position of the pixels in an image govern the edge detection?

Referring to this video by Andrew Ng:
https://youtu.be/XuD4C8vJzEQ?list=PLkDaE6sCZn6Gl29AoE31iwdVwSG-KnDzF
From this video I conclude that, for detecting vertical edges in an image, there must be a BRIGHTER region followed by a DARKER region starting from the left side; only then will [[1,0,-1],[1,0,-1],[1,0,-1]] act as a vertical edge detector, otherwise not.
Is my conclusion correct?
And is the converse also true?
If you think about the filter:
1 0 -1
1 0 -1
1 0 -1
you will see that it is just subtracting the pixels to the right from the pixels to the left at each location, i.e. finding the horizontal differences.
As such, it is capable of finding transitions from light to dark and dark to light, it's just that the differences will show up with an opposite sign (plus or minus). So, if you transition from a bright area on the left to a darker area on the right, you will have a large number (bright) minus a small number (dark) and the difference will be positive. Conversely, if you transition from a dark area on the left (small number) to a brighter area on the right (large number), you will end up with a negative difference.
Here is an example, just done in Terminal with ImageMagick. Start with this image:
Apply the filter you are talking about:
magick input.png -morphology convolve '3x3: 1,0,-1 1,0,-1 1,0,-1' result.png
And you can see it finds only dark-to-light edges.
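If you want to try this without a photograph, a minimal sketch (edges.png is just a scratch file name) builds a white/black/white strip, so it contains one light-to-dark edge and one dark-to-light edge, and prints the convolved pixel values:
# 12x3 strip: white, then black, then white again
magick -size 4x3 xc:white xc:black xc:white +append edges.png
# Convolve and print the pixel values to see which edge survives clamping to the unsigned range
magick edges.png -morphology convolve '3x3: 1,0,-1 1,0,-1 1,0,-1' txt:-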
If you want to detect edges from light to dark and dark to light, you need to either:
use a signed number (as opposed to unsigned) so you can hold negative results, or
add a "bias" to your convolution.
If your data was unsigned 8-bit, you could add a 50% bias by dividing all your current values by 2 and adding 127 before convolving, for example.
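As an illustrative calculation of that bias idea (considering just one left/right pixel pair, not the full 3x3 sum): a light-to-dark step from 200 down to 50 gives a raw difference of +150, which after halving and adding 127 becomes 202 (bright), while the opposite dark-to-light step gives -150, which becomes 52 (dark), so both transitions now fit in the unsigned range.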
So, applying a bias, your filter now finds dark-to-light and light-to-dark edges:
magick input.png -define convolve:scale='50%!' -bias 50% -morphology convolve '3x3: 1,0,-1 1,0,-1 1,0,-1' result.png
If you now want to detect horizontal edges transitioning from light-to-dark, rotate the filter to this:
-1 -1 -1
0 0 0
1 1 1
And apply:
magick input.png -morphology convolve '3x3: -1,-1,-1 0,0,0 1,1,1' result.png
Or, if you want to find horizontal edges transitioning from dark-to-light, use:
1 1 1
0 0 0
-1 -1 -1
magick input.png -morphology convolve '3x3: 1,1,1 0,0,0 -1,-1,-1' result.png
And the same again, but with a bias so we can find both light-to-dark and dark-to-light transitions in one fell swoop:
magick input.png -define convolve:scale='50%!' -bias 50% -morphology convolve '3x3: -1,-1,-1 0,0,0 1,1,1' result.png
Anthony Thyssen provides more excellent information about convolution than you could ever hope to need in a very approachable style here.

Does ImageMagick compare with RMSE always return 1 regardless of dissimilarity-threshold?

I'm using ImageMagick to compare files and I want it to return exit code 0 if the images are within some threshold of similarity. However, using metric RMSE and setting dissimilarity-threshold to allow some range of variability, it still returns 1. It only seems to return 0 when I give it 2 identical images.
For example:
> imageMagick compare -verbose -metric RMSE -dissimilarity-threshold 0.5 new_file.png old_file.png null
> echo $?
new_file.png PNG 1233x835 1233x835+0+0 8-bit sRGB 325677B 0.040u 0:00.040
old_file.png PNG 1233x835 1233x835+0+0 8-bit sRGB 325712B 0.040u 0:00.039
Image: new_file.png
Channel distortion: RMSE
red: 0 (0)
green: 0.358198 (5.46575e-06)
blue: 0.438701 (6.69415e-06)
alpha: 0 (0)
all: 0.283181 (4.32106e-06)
new_file.png=>null PNG 1233x835 1233x835+0+0 8-bit sRGB 216246B 0.210u 0:00.220
1
Since these two image files have such a small amount of difference and the total score calculated (0.283181) is less than my threshold of 0.5, I'd expect these two images to register as similar and return 0. (I've experimented with numerous dissimilarity-threshold values from 0.1 up into the millions, but they seem to have no effect.) Am I misunderstanding how to use this argument?
Edit: I know I can get the results I want using other combinations, like -metric AE with -fuzz 0.5%, but I'm still curious whether I can use dissimilarity-threshold with RMSE.
In ImageMagick, -metric rmse returns 0 (0) for perfectly matching images. The first value is in the quantum range of the ImageMagick compile. The second number, in parentheses, is in the range 0 to 1. So it will return the quantum range and (1) for totally mismatched images. The dissimilarity-threshold ranges from 0 to 1. Use 1 if you want to test dissimilar images and do not want it to complain that the images are too dissimilar. It is likely you won't need -dissimilarity-threshold if you are testing two same-sized images, but you will need it if using -subimage-search.
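If you are not sure what the quantum range of your particular build is (i.e. the scale of the first number), one quick check is to print it with an fx escape (a small sketch; a Q16 build reports 65535):
convert xc: -format "%[fx:QuantumRange]" info: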
RMSE is a measure of difference. So if the images are the same then the difference will be 0.
For example:
convert -size 100x100 xc:white white.png
convert -size 100x100 xc:gray gray.png
convert -size 100x100 xc:black black.png
compare -metric rmse white.png white.png -format "\n" null:
0 (0)
echo $?
0
compare -metric rmse white.png gray.png -format "\n" null:
compare -metric rmse white.png black.png -format "\n" null:
65535 (1)
echo $?
1
compare -metric rmse -dissimilarity-threshold 1 white.png black.png -format "\n" null:
65535 (1)
echo $?
1
compare -metric rmse -dissimilarity-threshold 0 white.png black.png -format "\n" null:
65535 (1)
echo $?
1
So for two same-sized images, -dissimilarity-threshold is irrelevant.
Your command
echo $?
is returning whether the command finished successfully or not. It is not the value of the rmse metric.
convert -size 200x200 xc:white white.png
convert -size 100x100 xc:black black.png
compare -metric rmse -subimage-search white.png black.png -format "\n" null:
compare: images too dissimilar `white.png' # error/compare.c/CompareImageCommand/1148.
echo $?
2
compare -metric rmse -subimage-search -dissimilarity-threshold 1 white.png black.png -format "\n" null:
65535 (1) # 0,0
echo $?
1
So the return code seems to be 0 for a perfect match, 1 for a non-perfect match, and 2 for an error.
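If the goal from the original question is an exit status of 0 whenever the images are "close enough", one approach is to capture the normalised RMSE yourself (compare writes the metric to stderr) and test it against your own threshold. A minimal sketch; the 0.01 threshold and file names are assumptions, not values from the question:
# Grab the value in parentheses (the 0-1 normalised RMSE) from compare's stderr
rmse=$(compare -metric RMSE new_file.png old_file.png null: 2>&1 | sed 's/.*(\(.*\))/\1/')
# Exit 0 if the normalised RMSE is below the chosen threshold, else exit 1
awk -v v="$rmse" 'BEGIN { exit (v + 0 < 0.01 ? 0 : 1) }'
echo $?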

How to stitch back a cropped image with ImageMagick?

I have a big, big image; let's name it orig-image.tiff.
I want to cut it into smaller pieces, apply things to them, and stitch the newly created little images back together.
I cut it into pieces with this command:
convert orig-image.tiff -crop 400x400 crop/parts-%04d.tiff
Then I generate many images by applying a treatment to each part-XXXX.tiff image, and I end up with images from part-0000.png to part-2771.png.
Now I want to stitch back the images into a big one. Can imagemagick do that?
If you were using PNG format, the tiles would "remember" their original position, as @Bonzo suggests, and you could take them apart and reassemble like this:
# Make 256x256 black-red gradient and chop into 1024 tiles of 8x8 as PNGs
convert -size 256x256 gradient:red-black -crop 8x8 tile-%04d.png
and reassemble:
convert tile*png -layers merge BigBoy.png
That is because the tiles "remember" their original position on the canvas - e.g. +248+248 below:
identify tile-1023.png
tile-1023.png PNG 8x8 256x256+248+248 16-bit sRGB 319B 0.000u 0:00.000
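To tie that back to the question's workflow (crop, treat each piece, stitch back), here is a minimal sketch; the -negate step is only a stand-in for whatever treatment you actually apply, and PNG is used so each tile keeps its page offset:
# Crop into 400x400 PNG tiles that remember their position on the canvas
convert orig-image.tiff -crop 400x400 crop/parts-%04d.png
# Apply some treatment to every tile (here just -negate as a placeholder)
for f in crop/parts-*.png; do convert "$f" -negate "$f"; done
# Reassemble using the remembered offsets
convert crop/parts-*.png -layers merge stitched.png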
With TIFs, you could do:
# Make 256x256 black-red gradient and chop into 1024 tiles of 8x8 as TIFs
convert -size 256x256 gradient:red-black -crop 8x8 tile-%04d.tif
and reassemble with the following, but sadly you need to know the layout of the original image:
montage -geometry +0+0 -tile 32x32 tile*tif BigBoy.tif
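If you go the montage route, the tile layout can be derived from the original image size and the crop size. A small sketch (file names follow the question; it assumes the original dimensions divide evenly by 400, since smaller edge tiles would be padded inside their montage cells and misalign the result):
w=$(identify -format "%w" orig-image.tiff)
h=$(identify -format "%h" orig-image.tiff)
# ceil(width/400) x ceil(height/400) tiles
montage -geometry +0+0 -tile $(( (w + 399) / 400 ))x$(( (h + 399) / 400 )) part-*.png stitched.png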
Regarding Glenn's comment below, here is the output of pngcheck showing the "remembered" offsets:
pngcheck tile-1023*png
Output
OK: tile-1023.png (8x8, 48-bit RGB, non-interlaced, 16.9%).
iMac:~/tmp: pngcheck -v tile-1023*png
File: tile-1023.png (319 bytes)
chunk IHDR at offset 0x0000c, length 13
8 x 8 image, 48-bit RGB, non-interlaced
chunk gAMA at offset 0x00025, length 4: 0.45455
chunk cHRM at offset 0x00035, length 32
White x = 0.3127 y = 0.329, Red x = 0.64 y = 0.33
Green x = 0.3 y = 0.6, Blue x = 0.15 y = 0.06
chunk bKGD at offset 0x00061, length 6
red = 0xffff, green = 0xffff, blue = 0xffff
chunk oFFs at offset 0x00073, length 9: 248x248 pixels offset
chunk tIME at offset 0x00088, length 7: 13 Dec 2016 15:31:10 UTC
chunk vpAg at offset 0x0009b, length 9
unknown private, ancillary, safe-to-copy chunk
chunk IDAT at offset 0x000b0, length 25
zlib: deflated, 512-byte window, maximum compression
chunk tEXt at offset 0x000d5, length 37, keyword: date:create
chunk tEXt at offset 0x00106, length 37, keyword: date:modify
chunk IEND at offset 0x00137, length 0
No errors detected in tile-1023.png (11 chunks, 16.9% compression).

Explanation of hough transform for ImageMagick

Preview:
I have done a hough line detection using the below mentioned code:
convert image.jpg -threshold 90% -canny 0x1+10%+30% \
\( +clone -background none \
-fill red -stroke red -strokewidth 2 \
-hough-lines 5x5+80 -write lines.mvg \
\) -composite hough.png
And I wrote the details of the lines to a .mvg file. The .mvg file contents are shown below:
# Hough line transform: 5x5+80
viewbox 0 0 640 360
line 448.256,0 473.43,360 # 104
line 0,74.5652 640,29.8121 # 158
line 0,289.088 640,244.335 # 156
line 0,292.095 640,247.342 # 133
line 154.541,0 179.714,360 # 125
line 151.533,0 176.707,360 # 145
And check the output hough.png file here.
Problem:
What do #104, #158, #156... stand for? I guess they are line numbers; if so, why are they numbered in such a way?
Also, I would like to know how the co-ordinates have been assigned.
It would be really helpful if I could get an explanation of the contents of the .mvg file.
The # <number> is the maxima value. It defaults to the count, which is set by line_count, and the value returned is influenced by the threshold you specified. The number will decrease if the matrix element count is greater than that of the previous height/width iteration. So, if you give it a threshold of -hough-lines 5x5+80, then line 448.256,0 473.43,360 # 104 was found about 24 pixels (or lines?) past the threshold. The next iteration would drop the maxima below the 80 threshold, so we stop comparing the matrix elements.
Also, I would like to know how the co-ordinates have been assigned.
I can only answer this by pseudo-quoting the source code, but it's basic trigonometry.
if ((x >= 45) && (x <= 135)) {
    y = (r - x * cos(t)) / sin(t)
} else {
    x = (r - y * sin(t)) / cos(t)
}
where r is defined as y minus the midpoint of the element matrix height,
and t is defined as x plus the midpoint of the rows.
Find out more in the HoughLineImage method located in feature.c
