Crop page from facing-page scan - opencv

Suppose I want to crop just the left pages from facing-page scans of a spiral notebook like the example below (from Paolini.net).
Is there a more robust way than simply cutting the image's width in half? For example, a smarter algorithm would detect the spiral binding, make that the right boundary, and even exclude the black area to the left of the page.
If there's a relatively easy way to do this with OpenCV or ImageMagick, I'd love to learn it.

One possible way in ImageMagick 6 with Unix shell scripting is the following:
Trim the image to remove most of the black on the sides.
Scale the image down to 1 row, then scale it up to 50 rows just for visualization.
Threshold the scaled image so that the black region down the spine becomes the largest black region.
Run connected-components processing to find the x coordinate of the largest black region.
Crop the image according to the connected-components results.
Input:
convert img.jpg -fuzz 25% -trim +repage img_trim.png
convert img_trim.png -scale x1! -scale x50! -threshold 80% img_trim_x1.png
centx=$(convert img_trim_x1.png -type bilevel \
-define connected-components:mean-color=true \
-define connected-components:verbose=true \
-connected-components 4 null: | \
grep "gray(0)" | head -n 1 | awk '{print $3}' | cut -d, -f1)
convert img_trim.png -crop ${centx}x+0+0 img_result.jpg
Data from connected components has the following header and structure:
Objects (id: bounding-box centroid area mean-color):
So head -n 1 gets the first black, i.e. gray(0) region which is the largest (sorted largest to smallest). The awk prints the 3rd entry, centroid, and the cut gets the x component.
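The field extraction described above can be tried without running ImageMagick at all. The following is a minimal sketch, assuming a listing in the layout given by the header line; the helper name largest_centroid_x and the sample values are made up for illustration and are not taken from the scan in the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ImageMagick): read a verbose
# connected-components listing on stdin and print the centroid x of the
# first region whose mean color matches $1. Because the listing is sorted
# largest to smallest, the first match is the largest such region.
largest_centroid_x() {
  grep -F "$1" | head -n 1 | awk '{print $3}' | cut -d, -f1
}

# Made-up sample listing for a trimmed facing-page scan.
sample_listing='Objects (id: bounding-box centroid area mean-color):
  0: 400x50+0+0 199.5,24.5 20000 gray(255)
  2: 380x50+460+0 649.5,24.5 19000 gray(255)
  1: 60x50+400+0 429.5,24.5 3000 gray(0)'

printf '%s\n' "$sample_listing" | largest_centroid_x "gray(0)"
```

With this sample the helper prints 429.5, the x coordinate of the middle of the dark spine region.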
If using ImageMagick 7, then change convert to magick
If you want to exclude the binders in the middle, then use the x-offset of the bounding box from the connected components listing:
convert img_trim.png -scale x1! -scale x50! -threshold 80% img_trim_x1.png
leftcenterx=$(convert img_trim_x1.png -type bilevel \
-define connected-components:mean-color=true \
-define connected-components:verbose=true \
-connected-components 4 null: | \
grep "gray(0)" | head -n 1 | awk '{print $2}' | cut -d+ -f2)
convert img_trim.png -crop ${leftcenterx}x+0+0 img_result2.jpg
If you want both pages as separate images, find the white regions, i.e. gray(255), and crop each one according to the width and x offset from its bounding box.
convert img.jpg -fuzz 25% -trim +repage img_trim.png
convert img_trim.png -scale x1! -scale x50! -threshold 80% img_trim_x1.png
OLDIFS=$IFS
IFS=$'\n'
bboxArr=(`convert img_trim_x1.png -type bilevel \
-define connected-components:mean-color=true \
-define connected-components:area-threshold=100 \
-define connected-components:verbose=true \
-connected-components 4 null: | \
grep "gray(255)" | awk '{print $2}'`)
IFS=$OLDIFS
num=${#bboxArr[*]}
for ((i=0; i<num; i++)); do
WW=`echo ${bboxArr[$i]} | cut -dx -f1`
Xoff=`echo ${bboxArr[$i]} | cut -d+ -f2`
convert img_trim.png -crop ${WW}x+${Xoff}+0 img_result3_$i.jpg
done
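The loop above pulls the width and x offset out of each bounding box with cut. As an alternative sketch, a single read can split a WxH+X+Y string into all four numbers. The helper name split_geometry is hypothetical, and the sketch assumes non-negative offsets, as in the listings shown here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a WxH+X+Y geometry string into four numbers.
# Assumes non-negative offsets (a geometry such as 10x10+-5+0 would not
# be handled).
split_geometry() {
  local w h x y
  # Treat both 'x' and '+' as field separators for this one read.
  IFS='x+' read -r w h x y <<< "$1"
  echo "$w $h $x $y"
}

split_geometry "400x50+0+0"
split_geometry "380x50+460+0"
```

The two calls print "400 50 0 0" and "380 50 460 0", so the width is the first field and the x offset is the third.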

Related

how to remove black bands that occur regularly in an image in imagemagick

I've captured an image, and the image capture extension has left black bands at regular intervals (see example below).
Is there an imagemagick command to remove all bands at once? I've tried to run it recursively, using the below pseudo-code, without success:
for i=1 to height of image/1000
split image at 1000 pixels * i
crop 10 pixels, top
stitch image with cropped image
EDIT: changed example image to a full resolution one
Here is how to crop each white section of your slides in ImageMagick 6 in Unix.
#
# threshold image
# use morphology to close up small black or white regions
# convert to bilevel
# do connected-component processing to find all regions larger than 1000 pixels in area
# keep only gray(255) i.e. white regions and get the bounding box and color and replace WxH+X+Y with W H X Y.
# sort by Y (rather than area) and put the x and +s back to re-form WxH+X+Y
# loop over data to get the bounding box and crop the image
#
OLD_IFS=$IFS
IFS=$'\n'
arr=(`convert slides.jpg -threshold 25% \
-morphology close rectangle:5 +write x1.png \
-morphology open rectangle:5 +write x2.png \
-type bilevel \
-define connected-components:verbose=true \
-define connected-components:exclude-header=true \
-define connected-components:area-threshold=1000 \
-define connected-components:mean-color=true \
-connected-components 8 y.png | grep "gray(255)" | sed 's/[x+]/ /g' | awk '{print $2, $3, $4, $5}'`)
IFS=$OLD_IFS
num=${#arr[*]}
echo $num
echo "${arr[*]}"
# sort array by Y value (print one entry per line so that sort sees separate records)
sortArr=(`printf '%s\n' "${arr[@]}" | sort -n -t " " -k4,4 | sed -n 's/^\(.*\) \(.*\) \(.*\) \(.*\)$/\1x\2+\3+\4/p'`)
echo "${sortArr[*]}"
for ((i=0; i<num; i++)); do
bbox="${sortArr[$i]}"
convert slides.jpg -crop $bbox +repage slides_section_$i.jpg
done
For ImageMagick 7, change "convert" to "magick".
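The sorting step can also be done directly on WxH+X+Y strings, without splitting them apart and re-forming them. This is only a sketch: the helper name sort_boxes_by_y and the bounding boxes are invented for illustration, and it assumes one geometry per line with non-negative offsets.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read WxH+X+Y geometries, one per line, and print
# them ordered by Y offset (the third '+'-separated field).
sort_boxes_by_y() {
  sort -t+ -k3,3n
}

# Made-up bounding boxes, listed in area order rather than page order.
printf '%s\n' "900x400+10+1020" "900x380+10+10" "900x390+10+510" | sort_boxes_by_y
```

The output lists the boxes from top to bottom (Y offsets 10, 510, 1020), which is the order the crop loop needs for sequential file names.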

Print pixel values where images differ with imagemagick?

Is there an easy way to print the pixel value where two images differ using imagemagick?
To be clear, I want to know what the value of that pixel is, as well as its coordinate. It doesn't matter from which image, since I can simply swap them to get the right one.
Let's make two images, both 3px wide and 1px tall:
convert xc:red xc:lime xc:blue +append 1.png
convert 1.png -flop 2.png
If we do the following, we can make any pixels that are identical in the two images become transparent:
convert {1,2}.png -compose changemask -composite mask.png # Note that {1,2}.png is just bash shorthand for "1.png" "2.png"
And if we re-order the input images:
convert {2,1}.png -compose changemask -composite mask.png # Note that {2,1}.png is just bash shorthand for "2.png" "1.png"
So, I assume you want the above, but in text format with the transparent pixels suppressed:
convert {1,2}.png -compose changemask -composite txt: | grep -v ",0)"
Output
# ImageMagick pixel enumeration: 3,1,65535,srgba
0,0: (65535,0,0,65535) #FF0000FF red
2,0: (0,0,65535,65535) #0000FFFF blue
Note that if you want to permit a small difference between the images, you can add a fuzz factor. So, if you want rgb(0,100,200) to be considered near enough equal to rgb(3,96,205), you could add -fuzz 5% at the start of the command.
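The grep filter on the txt: output can be made a little stricter by testing only the alpha value at the end of the pixel tuple. The following is a sketch, not the answer's own method: the helper name opaque_pixels is hypothetical, the transparent row in the sample is an assumed example of a suppressed line, and the exact txt: layout may vary between ImageMagick versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read ImageMagick txt: output on stdin and print
# "x,y hexcolor" for every pixel whose alpha (last tuple value) is not 0.
opaque_pixels() {
  awk '!/^#/ && $2 !~ /,0\)$/ { sub(/:$/, "", $1); print $1, $3 }'
}

# Sample based on the output above; the middle (transparent) row is an
# assumed example of a line that gets filtered out.
sample_txt='# ImageMagick pixel enumeration: 3,1,65535,srgba
0,0: (65535,0,0,65535)  #FF0000FF  red
1,0: (0,65535,0,0)  #00FF0000  srgba(0,255,0,0)
2,0: (0,0,65535,65535)  #0000FFFF  blue'

printf '%s\n' "$sample_txt" | opaque_pixels
```

For the sample this prints the two differing pixels, "0,0 #FF0000FF" and "2,0 #0000FFFF", and drops both the header and the transparent pixel.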
In ImageMagick 6, you can do the following to list the coordinates where the two images differ:
convert image1 image2 -compose difference -composite -threshold 0 txt: | tail -n +2 | grep "white" | awk '{print $1}' | sed 's/://g'
If using ImageMagick 7, change convert to magick.
ADDITION:
If you want both the coordinates and the color in one of the two images, then assuming the image has no perfect black pixels, you can do the following:
convert image1 image2 \
\( -clone 0,1 -compose difference -composite -threshold 0 \) \
-delete 1 \
-compose multiply -composite txt: |\
tail -n +2 | grep -v "black" | awk '{print $1,$4}'
For example, I take the lena image and put a blue square in the top left corner to make a second image.
Input:
convert lena.png \( -clone 0 -size 5x5 xc:blue -composite \) \
\( -clone 0,1 -compose difference -composite -threshold 0 \) \
-delete 1 \
-compose multiply -composite txt: |\
tail -n +2 | grep -v "black" | awk '{print $1,$4}'
Results:
0,0: srgb(226,137,124)
1,0: srgb(224,137,130)
2,0: srgb(225,135,121)
3,0: srgb(228,134,121)
4,0: srgb(227,138,125)
0,1: srgb(226,137,124)
1,1: srgb(224,137,131)
2,1: srgb(225,135,121)
3,1: srgb(228,134,121)
4,1: srgb(227,138,126)
0,2: srgb(226,138,124)
1,2: srgb(224,136,127)
2,2: srgb(225,135,120)
3,2: srgb(228,134,121)
4,2: srgb(227,137,121)
0,3: srgb(228,137,122)
1,3: srgb(225,134,114)
2,3: srgb(225,134,118)
3,3: srgb(229,132,112)
4,3: srgb(227,133,113)
0,4: srgb(224,130,109)
1,4: srgb(223,132,110)
2,4: srgb(224,132,116)
3,4: srgb(226,131,112)
4,4: srgb(226,134,117)
If you do have black and the images have no transparency, then you can do:
convert image1 image2 \
\( -clone 0,1 -compose difference -composite -threshold 0 \) \
-delete 1 \
-alpha off -compose copy_opacity -composite \
-background black -alpha background txt: |\
tail -n +2 | grep -v "none" | awk '{print $1,$4}'
For example:
convert lena.png \( -clone 0 -size 5x5 xc:blue -composite \) \
\( -clone 0,1 -compose difference -composite -threshold 0 \) \
-delete 1 \
-alpha off -compose copy_opacity -composite \
-background black -alpha background txt: |\
tail -n +2 | grep -v "none" | awk '{print $1,$4}'
Results:
0,0: srgba(226,137,124,1)
1,0: srgba(224,137,130,1)
2,0: srgba(225,135,121,1)
3,0: srgba(228,134,121,1)
4,0: srgba(227,138,125,1)
0,1: srgba(226,137,124,1)
1,1: srgba(224,137,131,1)
2,1: srgba(225,135,121,1)
3,1: srgba(228,134,121,1)
4,1: srgba(227,138,126,1)
0,2: srgba(226,138,124,1)
1,2: srgba(224,136,127,1)
2,2: srgba(225,135,120,1)
3,2: srgba(228,134,121,1)
4,2: srgba(227,137,121,1)
0,3: srgba(228,137,122,1)
1,3: srgba(225,134,114,1)
2,3: srgba(225,134,118,1)
3,3: srgba(229,132,112,1)
4,3: srgba(227,133,113,1)
0,4: srgba(224,130,109,1)
1,4: srgba(223,132,110,1)
2,4: srgba(224,132,116,1)
3,4: srgba(226,131,112,1)
4,4: srgba(226,134,117,1)

Bold Table Bars Removal using ImageMagick

I have an invoice image which contains table bars, as in the example below.
I am using ImageMagick to pre-process images with the command below.
convert 0.png -type Grayscale -negate -define morphology:compose=darken -morphology Thinning 'Rectangle:1x80+0+0<' -negate 0.png
My problem is that the output still contains the bold horizontal bars. ImageMagick fails to remove them, and the output looks as below.
What can I do to solve this?
Here is a different way using ImageMagick and connected components. First trim the image to remove the outer white, then use connected components to get the id of the largest black region, which should be id=0. Then run it again, removing the largest region by its id, which makes it transparent, and finally flatten the result against white. Then add the thinning operation to remove the horizontal lines that were not fully black. See https://imagemagick.org/script/connected-components.php
convert image.png -fuzz 5% -trim +repage \
-bordercolor black -border 1 \
-define connected-components:verbose=true \
-define connected-components:mean-color=true \
-connected-components 4 \
null:
Objects (id: bounding-box centroid area mean-color):
0: 953x205+0+0 478.7,65.6 31513 srgba(0,0,0,1)
10789: 943x19+5+184 488.4,193.1 16885 srgba(255,255,255,1)
1: 465x17+5+1 237.0,9.0 7905 srgba(255,255,255,1)
2: 474x17+474+1 733.5,9.0 7096 srgba(255,255,255,1)
3820: 281x21+667+67 807.0,76.9 5609 srgba(255,255,255,1)
5195: 281x21+667+90 807.0,99.9 5609 srgba(255,255,255,1)
7959: 281x20+667+137 807.0,146.4 5328 srgba(255,255,255,1)
9341: 281x20+667+160 807.0,169.5 5328 srgba(255,255,255,1)
6540: 281x20+667+114 807.0,123.4 5295 srgba(255,255,255,1)
2375: 281x19+667+46 807.0,55.0 5047 srgba(255,255,255,1)
...
convert image.png -fuzz 5% -trim +repage \
-bordercolor black -border 1 \
-define connected-components:remove=0 \
-define connected-components:mean-color=true \
-connected-components 4 \
-background white -flatten \
-negate \
-define morphology:compose=darken \
-morphology Thinning 'Rectangle:1x80+0+0<' \
-negate \
result.png
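Rather than assuming the largest black region has id 0, the id can be read from the verbose listing and passed to the remove define. The following is a minimal sketch under that idea; the helper name largest_region_id is hypothetical, and the sample reuses the first rows of the listing shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the id of the first (largest) region in a
# verbose connected-components listing whose mean color matches $1.
largest_region_id() {
  grep -F "$1" | head -n 1 | awk '{print $1}' | tr -d ':'
}

# First rows of the listing shown above.
listing='Objects (id: bounding-box centroid area mean-color):
  0: 953x205+0+0 478.7,65.6 31513 srgba(0,0,0,1)
  10789: 943x19+5+184 488.4,193.1 16885 srgba(255,255,255,1)
  1: 465x17+5+1 237.0,9.0 7905 srgba(255,255,255,1)'

id=$(printf '%s\n' "$listing" | largest_region_id "srgba(0,0,0,1)")
echo "-define connected-components:remove=$id"
```

For this listing the sketch prints the same define that the second command uses, with remove=0.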

ImageMagick change all images to preset size with DPI

I'm trying to use ImageMagick to bring all images to a preset size, such as letter size (8 1/2 x 11 inches). I'd prefer not to use resize, and I'm trying to get them to a 100 dpi setting. I'm personally not very good with ImageMagick, and after 2 days of searching around I think I've got it mostly complete:
for f in `ls *jpg`; do
convert -compress Group4 -type bilevel \
-depth 100 -units PixelsPerInch \
-monochrome -resize 850X1100 $f 2-$f;
done
Anyone have any further pointers on this?
You would use the -density option to set the DPI.
for f in `ls *jpg`
do
convert -compress Group4 \
-type bilevel \
-depth 100 \
-units PixelsPerInch \
-monochrome \
-resize 850X1100 \
-density 100 \
$f 2-$f
done
You can verify by using the identify utility.
identify -format "%x x %y" some_image.jpg
Edit:
As Birei pointed out, you can use the "*.jpg" wildcard to iterate over the files in a directory, and quoting the output file name is important for file names with spaces. You can use Filename Percent Escapes to build the output name and preserve source image information.
convert *.jpg \
-compress Group4 \
-type bilevel \
-depth 100 \
-units PixelsPerInch \
-monochrome \
-resize 850X1100 \
-density 100 \
-set filename:f '%f' \
'2-%[filename:f]'
The -set filename:f '%f' will preserve the original file name with proper escaping, and '2-%[filename:f]' will write the 'f' value with the custom prefix '2-'. There is no need to use a Bash for-loop.
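The 850x1100 geometry used in these commands is simply the page size in inches multiplied by the target DPI. As a small sketch of that arithmetic (the helper name page_pixels is hypothetical and not an ImageMagick feature):

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a page size in inches plus a DPI value into
# a WIDTHxHEIGHT pixel geometry (pixels = inches * DPI, rounded).
page_pixels() {
  awk -v w="$1" -v h="$2" -v dpi="$3" \
    'BEGIN { printf "%.0fx%.0f\n", w * dpi, h * dpi }'
}

page_pixels 8.5 11 100   # letter size at 100 DPI
page_pixels 8.5 11 300   # the same page at 300 DPI
```

The first call prints 850x1100, matching the -resize argument above; the second prints 2550x3300.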

ImageMagick: How to resize proportionally with mogrify without a background

I was following this example http://cubiq.org/create-fixed-size-thumbnails-with-imagemagick, and it's exactly what I want to do with the image, with the exception of having the background leftovers (i.e. the white borders). Is there a way to do this, and possibly crop the white background out? Is there another way to do this? The re-size needs to be proportional, so I don't just want to set a width re-size limit or height limit, but proportionally re-size the image.
The example you link to uses this command:
mogrify \
-resize 80x80 \
-background white \
-gravity center \
-extent 80x80 \
-format jpg \
-quality 75 \
-path thumbs \
*.jpg
First, mogrify is a bit dangerous. It manipulates your images in place and overwrites the originals. If something goes wrong, you have lost your originals and are stuck with the botched results. In your case, however, -path thumbs alleviates this danger, because it makes sure the results are written to the subdirectory thumbs.
Another ImageMagick command, convert, can keep your originals and do the same manipulation as mogrify:
convert \
input.jpg \
-resize 80x80 \
-background white \
-gravity center \
-extent 80x80 \
-quality 75 \
thumbs/output.jpg
If you want the same result, but without the white canvas extension (originally added to make the result a square 80x80 image), just leave out the -extent 80x80 parameter (-background white and -gravity center are then superfluous too):
convert \
input.jpg \
-resize 80x80 \
-quality 75 \
thumbs/output.jpg
or
mogrify \
-resize 80x80 \
-format jpg \
-quality 75 \
-path thumbs \
*.jpg
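To see what size a proportional -resize 80x80 leaves you with once the canvas extension is gone, the fit can be estimated by hand. This is a rough sketch with a hypothetical helper name, fit_within; ImageMagick's own rounding may differ by a pixel in some cases.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate the size that results from fitting a
# WxH image inside a box while keeping the aspect ratio.
# Usage: fit_within WIDTH HEIGHT BOX_WIDTH BOX_HEIGHT
fit_within() {
  awk -v w="$1" -v h="$2" -v bw="$3" -v bh="$4" 'BEGIN {
    sx = bw / w; sy = bh / h
    s = (sx < sy) ? sx : sy          # the tighter constraint wins
    printf "%.0fx%.0f\n", w * s, h * s
  }'
}

fit_within 1600 1200 80 80   # landscape source
fit_within 1200 1600 80 80   # portrait source
```

A 1600x1200 source comes out as 80x60 and a 1200x1600 source as 60x80, so the thumbnails are no longer all square.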
I know this is an old thread, but by using the -write flag with the -set flag, one can write to files in the same directory without overwriting the original files:
mogrify -resize 80x80 \
-set filename:name "%t_small.%e" \
-write "%[filename:name]" \
*.jpg
As noted at http://imagemagick.org/script/escape.php, %t is the filename without extension and %e is the extension. So the output of image.jpg would be a thumbnail image_small.jpg.
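The effect of the "%t_small.%e" pattern can be previewed with plain shell parameter expansion before touching any images. This is only an illustration of the naming rule; the helper name small_name is hypothetical and no ImageMagick call is involved.

```shell
#!/usr/bin/env bash
# Hypothetical helper mimicking the "%t_small.%e" pattern with plain
# shell parameter expansion.
small_name() {
  local file="${1##*/}"        # drop any directory part
  local base="${file%.*}"      # like %t: name without extension
  local ext="${file##*.}"      # like %e: extension
  echo "${base}_small.${ext}"
}

small_name "image.jpg"
small_name "photos/holiday.01.jpg"
```

The first call prints image_small.jpg, as in the example above; the second prints holiday.01_small.jpg, since only the last dot separates the extension.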
This is the command I use each time I want to batch resize everything to 1920x and keep the aspect ratio.
mogrify -path . -resize 1920x1920 -format "_resized.jpg" -quality 70 *.jpg

Resources