I don't see any rhyme or reason to the use of + vs -, and I've never seen a Unix command other than ImageMagick's suite (convert, etc.) which expects some flags to have a +. Win32 commands (and ports of them to Linux) sometimes use /, but never +.
I don't know the background behind this, but I personally find it quite rational. In general, the normal form is, like other Linux commands, preceded with a dash, or hyphen:
magick INPUT -something OUTPUT
The + form is used when there is a sense of:
negation, or
opposite direction, or
resetting, disabling or clearing, or
a shorthand form.
There may be some overlap in these concepts, and maybe other additional ones exist.
So, in terms of "negation":
magick INPUT -fill red -opaque blue RESULT
will turn all blue pixels into red, whereas this command:
magick INPUT -fill red +opaque blue RESULT
will turn all non-blue pixels into red.
Similarly, -adjoin will clump multiple images together into a single output file if possible, whereas +adjoin will force multiple, separate output files even when it may have been possible to make, say, a multipage TIFF or animated GIF.
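For example, assuming a multi-frame input (the file names here are just placeholders):
magick frame1.png frame2.png frame3.png -adjoin animated.gif
magick animated.gif +adjoin frame-%d.png
The first command bundles the frames into a single animated GIF; the second writes every frame of that GIF back out as separate, numbered PNG files.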
Another example is -level 10%,90% which will increase contrast so that the top and bottom 10% of the brightness range are discarded and the remaining 80% are stretched across the full, permissible brightness range. On the other hand, +level 10%,90% will decrease contrast by compressing the entire possible brightness range into the central 80% of the possible brightness range.
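In command form, using the same placeholder style as above:
magick INPUT -level 10%,90% MORE_CONTRAST
magick INPUT +level 10%,90% LESS_CONTRAST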
In terms of "opposite direction", this command will append images vertically below the first:
magick INPUT INPUT INPUT -append TALL_RESULT
whereas the following positive form will append images horizontally to the right:
magick INPUT INPUT INPUT +append WIDE_RESULT
In terms of "resetting, disabling or clearing", this command will use Riemer dithering:
magick INPUT -dither RiemerSMA ... RESULT
whereas the following positive form will disable dithering:
magick INPUT +dither ... RESULT
If you select a couple of channels to apply a filter or threshold to, you can reset back to the default channels afterwards:
magick INPUT -channel alpha -threshold 50% +channel RESULT
If you set a fuzz for some operation, you can reset it back to zero afterwards:
magick INPUT -fill red -fuzz 10% -opaque blue +fuzz -opaque yellow RESULT
which will set the fill colour to red, turn all pixels within 10% of blue into that red, and then, with the fuzz reset to zero, turn only perfectly yellow pixels into the same red.
In terms of "shorthand", -swap 0,2 will swap the first and third image in a sequence, whereas +swap will swap the last two in the sequence regardless of how many there are. This is a common operation and the plus form is succinct compared to the conventional alternative -swap -1,-2
Likewise, -clone 2 will clone the third image in a sequence, whereas +clone will clone the last... again, a very common operation. Compare +clone with the more conventional-looking, but IMHO uglier alternative -clone -1
Likewise, +delete will delete the last image in a sequence.
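As an illustration of how these shorthands combine, a common idiom for making a horizontally mirrored pair of an image (file names are placeholders; on Unix shells the parentheses need escaping as \( and \)) is:
magick INPUT ( +clone -flop ) +swap +append MIRRORED_PAIR
which clones the input, mirrors the clone left-to-right with -flop, swaps the two so the mirrored copy comes first, and then appends them side by side.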
I am trying to use ImageMagick 7 to detect if a specific channel in an image is largely pure black and pure white (plus a little antialiasing, and there's a chance the image could be pure black). This is to distinguish from another kind of image that shares a naming convention but has photographic-like image data in the r/g/b channels.
(Basically both image types are specular maps from different engines. The one I'm trying to differentiate here is more modern and has the metallic map in the blue channel; the other is much older and just has the specular colour in the RGB channels and the gloss map in the alpha.)
Currently I'm comparing the channel to a clone of itself that has had a 50% threshold applied, using the AE metric to see if it's largely the same apart from a small amount of antialiasing, and a fuzz of 1% to account for occasional aberration from pure black/white. This command works, but of course at the moment it only returns the number of distorted pixels:
magick ( "file.png" -channel b -separate ) ^
( +clone -channel b -separate -threshold 50% ) ^
-fuzz 1% -metric AE -compare ^
-format "%[distortion]" info:
Because the input image sizes will vary, I want to divide the distortion by the total number of pixels in the image to get the relative amount of the image that's not pure black/white -- under 10% has seemed good so far in my manual testing -- but I can't get the format syntax right. Everything I've tried -- for example "%[fx:%[distortion]/w*h]" -- has given the magick: undefined variable `[distortion]' # error/fx.c/FxGetSymbol/1169 error.
What syntax should I use? (And if there's a better way to do what I'm doing, I always appreciate it!)
I believe the following is what you want in ImageMagick. Basically you save the distortion into an option with -set option: and then reference it in the fx expression later.
However, +clone gives you just the b channel, so there should be no need for -channel b -separate in your second line.
magick ( "file.png" -channel b -separate ) ^
( +clone -threshold 50% ) ^
-fuzz 1% -metric AE -compare ^
-set option:distort "%[distortion]" ^
-format "%[fx:distort/(w*h)]" info:
Fred (#fmw42) has already provided an excellent method. There is another method for differentiating pure black and white images from greyscale images with a fuller tonal scale which may interest you. Credit to Anthony Thyssen for the technique described here.
If you use -solarize 50% in ImageMagick it inverts all the highlights, so it effectively folds your histogram in half and all the whites become pure black and all the near-whites become near blacks. The command looks like this:
magick INPUT -solarize 50% OUTPUT
So, if I apply that to a couple of input images - the first one pure black and near-white, the second a greyscale - and show the corresponding output image on the right, you'll see the effect:
If you now inspect the mean and standard deviation of the two solarised images:
magick {a,b}-sol.jpg -format "%f, mean: %[mean], stdev: %[standard-deviation]\n" info:
a-sol.jpg, mean: 2328.91, stdev: 3175.67
b-sol.jpg, mean: 16319.5, stdev: 9496.04
you can see that the mean and standard deviation of the first (pure black and white) image are low because all the bright whites have folded to near blacks, whereas the mean and standard deviation of the greyscale image are both higher because its tones are more spread out.
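If you want to turn that into an automatic test in a shell script, a minimal sketch might look like this - the 0.1 cut-off on the normalised mean is just a guess that you would tune against your own images:
mean=$(magick INPUT -solarize 50% -format "%[fx:mean]" info:)
if [ $(echo "$mean < 0.1" | bc) -eq 1 ]; then echo "mostly pure black/white"; else echo "fuller tonal range"; fi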
I am comparing two images which are essentially the same, except that in one of them the text is shifted by a couple of pixels. Please take a look at the URL below; it is a GIF that shows the difference between the two similar images.
https://giphy.com/gifs/9x50JjoLSPZ7lKRebk
My team initially used the compare command, which doesn't address this issue. Any suggestions, please?
You can remove all the text in Imagemagick and just compare the bars by thresholding the Saturation/Chroma channel and then doing the compare. The text is gray, so it has little if any saturation. The bars are cyan, so they are colored and have a medium to high saturation.
convert giphy.gif -colorspace HCL -channel g -separate +channel -threshold 5% +write tmp.gif miff:- | compare -metric rmse - null:
3164.96 (0.0482942)
So this is 4.8% different.
I save tmp.gif, which you do not need, only to show the result of the processing before the compare.
If your version of ImageMagick is too old and you do not have -colorspace HCL, then try HSL or HSB. Chroma (C) and Saturation (S) are similar measures of colourfulness, so the same approach applies.
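For example, with HSL the saturation also ends up in the second (green-position) channel after the colorspace conversion, so the fallback would look something like this (untested sketch):
convert giphy.gif -colorspace HSL -channel g -separate +channel -threshold 5% miff:- | compare -metric rmse - null: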
I am trying to do OCR using Tesseract, and overall the results seem acceptable. The images are very long receipts scanned on a scanner, so the quality is good. The only issue is that in the receipts a few characters are joined across two lines.
Please see the attached sample image. You can see that the character 'p' in the first line and the character 'M' in the second line are joined. This is causing problems in the OCR.
So, the real question is: can we add a white line or gap between every text line?
You can do that for this image in ImageMagick by trimming the image to remove the surrounding white and adding the same amount of black back. Then average the image down to a single column and look for the brightest row. I start and stop 4 pixels from the top and bottom to avoid any really bright rows in those regions. Once I find the brightest row, I splice in 4 rows of white at that row, between the top and bottom parts of the text. This is not the most elegant way, but it shows the potential. One could likely pipe the list of row values to AWK and search for the maximum value more efficiently than saving to an array and using a for loop (see the sketch after the script). Unix syntax with ImageMagick.
Input:
max=0
row=0
arr=()
# Trim the white border, replace it with black, average the image down to a single
# column of grey values and read them into an array (one value per image row).
arr=(`convert text.png -fuzz 50% -trim -background black -flatten -colorspace gray -scale 1x! -depth 8 txt:- | tail -n +2 | sed -n 's/^.*gray[(]\(.*\)[)]$/\1/p'`)
num=${#arr[*]}
#echo "${arr[*]}"
# Find the brightest row, ignoring the first and last 4 rows.
for ((i=4; i<num-4; i++)); do
    val="${arr[$i]}"
    max=`convert xc: -format "%[fx:$val>$max?$val:$max]" info:`
    row=`convert xc: -format "%[fx:$val==$max?$i:$row]" info:`
    #echo "$i $val $max $row"
done
# Splice 4 rows of white into the original image at the brightest row.
convert text.png -gravity north -splice 0x4+0+$row text2.png
If you want less space, you can change to -splice 0x1+0+$row, but it won't change much. It is not writing over your image, but inserting white between the existing rows.
Even with the processing above, your OCR may still not recognize the p or the M, since the bottom of the p is cut off and attached to the M.
If you have more than two lines of text, you will have to search the column for approximately evenly spaced maxima.
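For what it is worth, here is a rough sketch of the AWK approach mentioned above. It assumes GNU head (for head -n -4) and, like the loop, it ignores the first and last 4 rows; the row it picks may differ slightly from the loop version when several rows share the maximum value:
row=$(convert text.png -fuzz 50% -trim -background black -flatten -colorspace gray -scale 1x! -depth 8 txt:- | \
    tail -n +2 | sed -n 's/^.*gray[(]\(.*\)[)]$/\1/p' | head -n -4 | \
    awk 'NR>4 && $1>max {max=$1; row=NR-1} END {print row}')
convert text.png -gravity north -splice 0x4+0+$row text2.png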
I have an image like this from my wind station:
I have tried to get those lines recognized, but without success, because none of the filters recognize the lines.
Any ideas what I could use to get it black & white with at least the lines I need?
A typical detection result is something like this:
I need to detect the edges of the digits, which seem not to be recognized with almost any settings.
This doesn't provide a complete guide to solving your image-processing question with OpenCV, but it contains some hints and observations that may help you get there. My weapon of choice is ImageMagick, which is installed on most Linux distros and is available for OS X and Windows.
Firstly, I note you have the date and time across the top, and you haven't cropped correctly at the lower right-hand side - these extraneous pixels will affect contrast stretches, so I crop them off.
Secondly, I separate your image into 3 channels - R, G and B - and look at them all. The R and B channels are very noisy, so I would probably go with the Green channel. Alternatively, the Lightness channel is pretty reasonable if you go to HSL mode and discard the Hue and Saturation.
convert display.jpg -separate channel.jpg
(Red, Green and Blue channel images)
Now make a histogram to look at the tonal distribution:
convert display.jpg -crop 500x300+0+80 -colorspace hsl -separate -delete 0,1 -format %c histogram:png:ahistogram.png
Now I can see all your data are down the dark, left-hand end of the histogram, so I do a contrast stretch and a median filter to remove the noise
convert display.jpg -crop 500x300+0+80 -colorspace hsl -separate -delete 0,1 -median 9x9 -normalize -level 0%,40% z.jpg
And a final threshold to get black and white...
convert display.jpg -crop 500x300+0+80 -colorspace hsl -separate -delete 0,1 -median 9x9 -normalize -level 0%,40% -threshold 60% z.jpg
Of course, you can diddle around with the numbers and levels, but there may be a couple of ideas in there that you can develop... in OpenCV or ImageMagick.
Given the RGB values of all pixels of an image, how can we find the probability that a given pixel is of skin colour, and what percentage of the image is of skin colour?
Noodling around on Google tells me that Caucasian skin tones often, or maybe generally, or maybe sometimes conform to the following sort of rule:
Blue channel: 140-180
Green channel: Blue * 1.15
Red channel: Blue * 1.5
So, with that in mind, I made some colour swatches corresponding to that rule with ImageMagick, using this script:
#!/bin/bash
# Generate a labelled 200x200 swatch for each blue value from 140 to 180 in steps
# of 5, with green = blue * 1.15 and red = blue * 1.5, then montage them together.
for b in $(seq 140 5 180); do
    g=$(echo "$b * 1.15/1" | bc)
    r=$(echo "$b * 1.5/1" | bc)
    convert -label "R:$r,G:$g,B:$b" -size 200x200 xc:"rgb($r,$g,$b)" miff:-
done | montage - -frame 5 -tile 3x swatches.png
And got this:
Ok, those look kind of reasonable, now I try to use those to detect skin tones, again with ImageMagick. For the moment, and just so you can see it, I am going to colour lime-green everything I detect as a skin-tone, using this colour, which is right in the middle of the tonal range identified above:
convert -fuzz 5% face1.jpg -fill lime -opaque "rgb(240,184,160)" out.jpg
Mmmm, not very good. Increase the fuzziness maybe?
Mmmm, still pretty rubbish - picking up only part of the skin and some of the white shirt collar. Different face maybe?
Ok, not bad at detecting him, although notice it completely fails to detect the right side of his face, however there are still a few problems as we can see from the pink cadillac:
and Miss Piggy below...
Maybe we can be a bit more targeted in our search, and, though I can't draw it in 3-D, I can explain in 2-D. Instead of targeting a single large circle (actually sphere in 3-D space) in the middle of our range, maybe we could target some smaller circles spread along our range and thereby include fewer extraneous colours... the magenta represents the degree of fuzz. So rather than this:
we could do this:
using this command:
convert -fuzz 13% face1.jpg -fill lime \
-opaque "rgb(219,168,146)" \
-opaque "rgb(219,168,146)" \
-opaque "rgb(255,198,172)" out.jpg
So, you can see it is pretty hard to find skin-tones just by using RGB values and I haven't even started to address different races, different lighting etc.
Another approach may be to use a different colourspace, such as HSL - Hue, Saturation and Lightness. We are not so interested in Lightness because that is just a function of exposure, so we look for hues that match those of skin and some degree of saturation to avoid washed-out colours. You can do that with ImageMagick like this:
#!/bin/bash
convert face1.jpg -colorspace hsl -separate \
\( -clone 0 -threshold 7% -negate +write h.png \) \
\( -clone 1 -threshold 30% +write s.png \) \
-delete 0-2 -evaluate-sequence min out.png
That says this... take the image face1.jpg and convert it to HSL colorspace, then separate the layers so we now have 3 images in our stack. Image 0 is the Hue, image 1 is the Saturation and image 2 is the Lightness. Next line. Take the Hue layer and threshold it at 7%, which means pinky-reds, invert it and save it (just so you can see it) as h.png. Next line. Take the Saturation layer, and say "any saturation over 30% is good enough for me", then save as file s.png. Next line. Delete the 3 original layers (H, S & L) from the stack, leaving just the thresholded Hue and thresholded Saturation layers. Now stack these on top of each other and take whichever is the minimum at each pixel, and save that. The point is that either the Hue or the Saturation layer can be used to gate which pixels are selected.
Here are the files, first the Hue (h.png):
next the Saturation (s.png):
and now the combined output file.
Once you have got your algorithm sorted out for deciding which pixels are skin coloured, you will need to count them to work out the percentages you seek. That is pretty easy... all we do is change everything that is not lime-green to black (so it counts for zero in the averaging) and then resize the image to a single pixel and get its colour as text:
convert -fuzz 13% face1.jpg -fill lime \
-opaque "rgb(219,168,146)" \
-opaque "rgb(219,168,146)" \
-opaque "rgb(255,198,172)" \
-fill black +opaque lime -resize 1x1! txt:
# ImageMagick pixel enumeration: 1,1,255,srgb
0,0: (0,92,0) #005C00 srgb(0,92,0)
We can see there is, not surprisingly, no red and no blue, and the average value of the green channel is 92/255, so roughly 36% of the pixels match our description of skin-toned.
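An alternative, if you prefer to get a number straight away rather than reading the pixel enumeration, is to separate the green channel and ask for its mean - a sketch along the same lines, with the same -opaque targets as before:
convert -fuzz 13% face1.jpg -fill lime \
    -opaque "rgb(219,168,146)" \
    -opaque "rgb(255,198,172)" \
    -fill black +opaque lime -channel G -separate -format "%[fx:100*mean]" info:
Because the lime pixels have a green value of 1.0 and everything else is 0, the mean of the separated green channel is exactly the fraction of skin-toned pixels, printed here as a percentage.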
If you want to get more sophisticated you may have to look at shapes, textures and contexts, or train a skin classifier and write a whole bunch of stuff in OpenCV or somesuch...