Find the first black pixel in every row with ImageMagick - imagemagick

For every row in an image, I would like to find the first black (or first non-white) pixel in that row. For example, for an image like this:
I would expect output like:
0
1
0
Or something close to that that I can parse. I think there might be a way to do this with subimage-search but I don't quite know how. Any pointers?

You do NOT need subimage-search to achieve your goal. The problem can be reduced to text parsing.
1. Basics
Consider this: you can tell ImageMagick to convert any image to a textual representation, which holds the exact color information for each individual pixel. Example:
convert wizard: textwizard.txt
(wizard: is a builtin image available for all ImageMagick installations for testing purposes.)
Yes, it is that easy! This image "format" is requested by just adding a .txt suffix. Results:
# ImageMagick pixel enumeration: 480,640,255,srgb
0,0: (255,255,255) #FFFFFF white
1,0: (255,255,255) #FFFFFF white
2,0: (255,255,255) #FFFFFF white
[....]
47,638: (246,247,249) #F6F7F9 srgb(246,247,249)
48,638: (246,247,249) #F6F7F9 srgb(246,247,249)
47,639: (236,235,236) #ECEBEC srgb(236,235,236)
48,639: (230,228,218) #E6E4DA srgb(230,228,218)
[....]
476,639: (255,255,255) #FFFFFF white
477,639: (255,255,255) #FFFFFF white
478,639: (255,255,255) #FFFFFF white
479,639: (255,255,255) #FFFFFF white
If you look at the first line of the output, you'll notice that ImageMagick uses it to detail some special info about the image here:
# ImageMagick pixel enumeration: 480,640,255,srgb
It means:
the image is 480 pixels wide,
the image is 640 pixels high,
the image uses a range of 0-255 for color info per channel (that is equivalent to an 8-bit color depth),
the image is built in the sRGB color space
The other lines consist of 4 columns:
the first column, in the format N,M:, indicates the exact position of the respective pixel as column_number,row_number (that is, x,y). (The index for row and column numbers is zero-based -- row no. 1 is indicated as 0, no. 2 as 1.)
the other three columns, redundantly, each hold the exact same information, each in a different notation: the exact color value for the pixel given in column 1. (The last column will use a human-readable name if ImageMagick knows one for that color value...)
As a side note: you can use such a textual representation of the original image (with or without some extra modifications) to re-create a real image:
convert textwizard.txt wizard.jpg
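As a rough sketch of how the header line described above could be pulled apart in a script: the function name parse_enum_header is made up for this example, and it assumes the header looks exactly like the one shown in the sample output.

```shell
#!/bin/sh
# parse_enum_header (hypothetical helper name): read ImageMagick "txt:"
# output on stdin and print "width height max colorspace", taken from
# the header comment line.
parse_enum_header() {
    awk -F'[:,] *' '
        /^# ImageMagick pixel enumeration:/ {
            # $1 is the text before the colon, $2..$5 are the four values
            print $2, $3, $4, $5
            exit
        }'
}
```

It would be used as `convert wizard: txt:- | parse_enum_header`, which for the header shown above gives the four values separated by spaces.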
2. Select a specific row
You should be aware that you can select a specific region of an image with the following syntax:
image.png[WIDTHxHEIGHT+X_OFFSET+Y_OFFSET]
So to select a specific row only, you can set HEIGHT as 1. To get any row completely, set X_OFFSET as 0 and WIDTH to the image width. To get a specific row, set the Y_OFFSET accordingly.
In order to get the values (for the builtin wizard: image used above) for the row with index 47, we can do:
convert wizard:[480x1+0+47] row47.txt
cat row47.txt
# ImageMagick pixel enumeration: 480,1,255,srgb
0,0: (255,255,255) #FFFFFF white
1,0: (255,255,255) #FFFFFF white
2,0: (255,255,255) #FFFFFF white
[....]
428,0: (82,77,74) #524D4A srgb(82,77,74)
429,0: (169,167,168) #A9A7A8 srgb(169,167,168)
430,0: (232,231,228) #E8E7E4 srgb(232,231,228)
432,0: (246,247,249) #F6F7F9 srgb(246,247,249)
[....]
476,0: (255,255,255) #FFFFFF white
477,0: (255,255,255) #FFFFFF white
478,0: (255,255,255) #FFFFFF white
479,0: (255,255,255) #FFFFFF white
If you do not want the textual output in a file, but printed on the standard output channel, you can do this:
convert wizard:[480x1+0+47] txt:-
3. Stitching it all together
Based on above snippets of information, the approach that can be taken with this task is clear:
Loop through all pixel rows of the image.
Output each pixel's color value as text.
Look for the first non-white pixel and keep its location information.
4. Possible script (OS X, Linux, Unix)
Here is a main part of a Bash script that could be used:
# Define some image specific variables (width, height, ...)
image=${1}
number_of_columns=$(identify -format '%W' "${image}")
width=${number_of_columns} # just an alias
number_of_rows=$(identify -format '%H' "${image}")
height=${number_of_rows} # just an alias
max_of_indices=$(( ${height} -1 ))
# Loop through all rows and grep for first non-white pixel
for i in $(seq 0 ${max_of_indices}); do
    echo -n "Row ${i} : "
    convert "${image}[${width}x1+0+${i}]" txt:- \
    | grep -v enumeration \
    | grep -v '#FFFFFF' -m 1 \
    || echo "All WHITE pixels in row!"
done
The -v '#FFFFFF' will de-select all lines which contain the string #FFFFFF (i.e. all pure white pixels).
The -m 1 parameter will return at most 1 match (i.e. the first match).
It will be slow, but it will work.

I would go with something like this using the built-in checkerboard pattern:
convert -size 100x100 pattern:checkerboard -auto-level board.png
#!/bin/bash
convert wizard: txt: | awk -F'[,: ]' '
/^#/ || /#FFFFFF/ {next}
!($2 in fb) {fb[$2]=$1}
END {r=$2;for(i=0;i<=r;i++){if(i in fb)print i,fb[i]; else print i,"-1"}}'
The -F'[,: ]' tells awk to split each line on commas, colons or spaces - this helps me get at the column and row at the start of each line. The line with /^#/ skips the comment in the first line of ImageMagick's text output, and /#FFFFFF/ skips all lines that describe a white pixel.
Then, I have an array fb[] , indexed by image row, that holds the column of the first black pixel on each row. Each time I find a line with a row not in my array fb[], I save it in the array.
At the end, inside END{}, I run through fb[] printing all rows and indices of first black pixels in those rows. Note that I output -1 in place of any undefined elements (i.e. those with no non-white pixels) - thanks to @KurtPfeifle for the hint.
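One possible variation on the awk approach, as a sketch only: it takes the number of rows from the header line rather than from the last record, since not every older awk is guaranteed to keep $2 available inside END. The function name first_nonwhite_per_row is invented here, and "white" is assumed to mean exactly #FFFFFF.

```shell
#!/bin/sh
# first_nonwhite_per_row (hypothetical name): read "txt:" output on stdin
# and print "row column" for the first non-white pixel of every row,
# or "row -1" when the whole row is white.
first_nonwhite_per_row() {
    awk -F'[,: ]+' '
        /^#/        { height = $6; next }    # header: "# ... enumeration: W,H,MAX,CS"
        /#FFFFFF/   { next }                 # skip pure white pixels
        !($2 in fb) { fb[$2] = $1 }          # first non-white column seen in row $2
        END {
            for (y = 0; y < height; y++)
                print y, ((y in fb) ? fb[y] : -1)
        }'
}
```

Usage would be `convert image.png txt:- | first_nonwhite_per_row`. Because the pixels arrive left to right within each row, the first hit per row is also the leftmost one.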

Related

Adding white line between text lines

I am trying to do OCR using Tesseract, and the overall results seem acceptable. The images are very long receipts that we scan with a scanner, so the quality is good. The only issue is that in some receipts a few characters are joined between two lines.
Please see the attached sample image. You can see that the character 'p' in the first line and the character 'M' in the second line are joined. This causes problems in OCR.
So, the real question is: can we add a white line or square between every text line?
You can do that for this image in ImageMagick by trimming the image to remove the surrounding white and replacing the trimmed border with the same amount of black. Then scale that image down to a single column, which averages each row, and look for the brightest row. I start and stop 4 pixels from the top and bottom to avoid any really bright rows in those regions. Once I find the brightest row, I splice in 4 rows of white between the top and bottom regions divided by that row. This is not the most elegant way, but it shows the potential. One could likely pipe the list of row values to awk and search for the maximum more efficiently than saving them to an array and using a for loop. Unix syntax with ImageMagick.
Input:
max=0
row=0
arr=()
arr=(`convert text.png -fuzz 50% -trim -background black -flatten -colorspace gray -scale 1x! -depth 8 txt:- | tail -n +2 | sed -n 's/^.*gray[(]\(.*\)[)]$/\1/p'`)
num=${#arr[*]}
#echo "${arr[*]}"
for ((i=4; i<num-4; i++)); do
val="${arr[$i]}"
max=`convert xc: -format "%[fx:$val>$max?$val:$max]" info:`
row=`convert xc: -format "%[fx:$val==$max?$i:$row]" info:`
#echo "$i $val $max $row"
done
convert text.png -gravity north -splice 0x4+0+$row text2.png
If you want less space, you can change to -splice 0x1+0+$row, but it won't change much. It is not writing over your image, but inserting white between the existing rows.
But by doing the processing above, your OCR still may not recognize the p or M, since the bottom of the p is cut off and appended to the M.
If you have more than two lines of text, you will have to search the column for approximately evenly spaced maxima.
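The awk idea mentioned above could look roughly like this. It is only a sketch: brightest_row is a made-up name, it expects one gray value per line (as produced by the sed step in the script), and unlike the fx loop it keeps the first row on a tie rather than the last.

```shell
#!/bin/sh
# brightest_row (hypothetical name): read one brightness value per line
# on stdin and print the 0-based index of the largest value, ignoring
# the first and last MARGIN rows (default 4).
brightest_row() {
    margin=${1:-4}
    awk -v margin="$margin" '
        { val[NR - 1] = $1 + 0 }             # store values with a 0-based row index
        END {
            best = -1; row = -1
            for (i = margin; i < NR - margin; i++)
                if (val[i] > best) { best = val[i]; row = i }
            print row
        }'
}
```

The result would replace the for loop, for example `row=$(convert text.png ... txt:- | tail -n +2 | sed -n '...' | brightest_row 4)` with the same convert and sed arguments as in the script, followed by the same -splice command. This runs one awk process instead of two convert calls per row.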

Find nearest point horizontal in imagemagick?

I'm trying to find the nearest point to a given point (red in this case) in this image. The output should find the first line point from the right.
How could I do this?
output
Please help me.
This looks fun! Let's dump the image to text using ImageMagick:
convert image.png txt:
# ImageMagick pixel enumeration: 337,218,65535,srgb
0,0: (65535,65535,65535) #FFFFFF white
1,0: (65535,65535,65535) #FFFFFF white
2,0: (65535,65535,65535) #FFFFFF white
3,0: (65535,65535,65535) #FFFFFF white
4,0: (65535,65535,65535) #FFFFFF white
...
...
221,79: (0,0,0) #000000 black
221,80: (0,0,0) #000000 black
221,81: (0,0,0) #000000 black
221,82: (0,0,0) #000000 black
...
...
Ok, now let's use awk to find all black pixels and print their (x,y) coordinates:
convert image.png txt: | awk -F'[,:]' '/black/{x=$1;y=$2;print x,y}'
221 79
221 80
221 81
221 82
221 83
221 84
...
...
Ok, now let's tell awk where the red pixel is by passing in rx (red x-coordinate) and ry (red y-coordinate). Then also, calculate the sum of the squares of the x-distance and y-distance from red to each black pixel. When it is less (i.e. nearer) than any seen so far, save the location. Print the nearest location at the end.
convert image.png txt: | awk -F'[,:]' -v rx=318 -v ry=127 '
BEGIN{m=999999}
/black/{
x=$1; y=$2; d2=(rx-x)*(rx-x)+(ry-y)*(ry-y)
if(d2<m){m=d2;xm=x;ym=y}
}
END{print xm,ym}'
277 127
So, that is the answer... (277,127). Let's check it by drawing a cyan circle there:
convert image.png -fill cyan -draw "circle 277,127 277,132" check.png
On re-reading the question, I note that you actually only want the horizontally closest point whereas my solution above caters for the general case in any direction. If you just want horizontal offset, and you know the horizontal line is at y-coordinate 127, you can just extract that specific row from the image and simplify things like this:
convert image.png -crop x1+0+127 txt: | awk -F'[,:]' -v rx=318 '
BEGIN{m=999999} /black/{x=$1;d=(rx-x)*(rx-x);if(d<m){m=d;xm=x}} END{print xm}'
277
If you don't like awk, you can just do it by eyeball...
convert image.png -crop x1+0+127 txt: | grep -E "black|red"
221,0: (0,0,0) #000000 black
277,0: (0,0,0) #000000 black <--- nearest black to red
314,0: (65535,0,0) #FF0000 red
315,0: (65535,0,0) #FF0000 red
316,0: (65535,0,0) #FF0000 red
317,0: (65535,0,0) #FF0000 red
318,0: (65535,0,0) #FF0000 red
319,0: (65535,0,0) #FF0000 red
320,0: (65535,0,0) #FF0000 red
How did I find the coordinates of the red pixel? I used ImageMagick's sub-image search looking for a red pixel like this:
compare -metric rmse -subimage-search -dissimilarity-threshold 1 image.png \( -size 1x1 xc:red \) null:
0 (0) @ 317,121
Notes:
I just used the sum of the squares rather than the square root of the sum of the squares because it is computationally faster and the results are the same: for non-negative a and b, if a>b then a * a > b * b, so comparing squared distances picks the same nearest point.
I used slightly different rx and ry from those generated by the sub-image search because OP says he had the coordinates and the ones found by sub-image search don't find the exact centre of the rather large red blob, but instead the top-leftmost edge of the red blob.

Combining images that are "cut off" in ImageMagick?

I would like to combine 2 images, which are identical in width, but vary in height. They are identical on the bottom/top side, but it's unknown how much.
1) Identify identical parts
2) Combine the images so the identical parts match
Example:
Part 1: http://i.imgur.com/rZtAk2c.png
Part 2: http://i.imgur.com/CQaQbr8.png
1. Determine the image dimensions
Use identify to get width and height of each image:
identify \
http://i.imgur.com/rZtAk2c.png \
http://i.imgur.com/CQaQbr8.png
CQaQbr8.png PNG 701x974 720x994+10+0 8-bit sRGB 256c 33.9KB 0.000u 0:00.000
rZtAk2c.png PNG 701x723 720x773+10+46 8-bit sRGB 256c 25.6KB 0.000u 0:00.000
2. Interpret the results
The results from the above command are these:
Both images show 701 pixels wide rows.
One image shows 974 different rows.
The other image shows 723 different rows.
But both images use a different 'canvas' size.
The first image uses a 720x994 pixels canvas (offset of shown part is +10+0).
The second image uses a 720x773 pixels canvas (offset of shown part is +10+46).
3. Normalize the canvas to be identical with the shown pixels
We use the +repage image operator to normalize the canvas for both images:
convert CQaQbr8.png +repage img1.png
convert rZtAk2c.png +repage img2.png
4. Check both new images' dimensions again
identify img1.png img2.png
img1.png PNG 701x974 701x974+0+0 8-bit sRGB 256c 33.9KB 0.000u 0:00.000
img2.png PNG 701x723 701x723+0+0 8-bit sRGB 256c 25.5KB 0.000u 0:00.000
5. Learn how to extract a single row from an image.
As an example, we extract row number 3 from img1.png (numbering starts with 0):
convert img1.png[701x1+0+3] +repage img1---row3.png
identify img1---row3.png
img1---row3.png PNG 701x1 701x1+0+0 8-bit sRGB 256c 335B 0.000u 0:00.000
6. Learn how to extract that same row in ImageMagick's 'txt' format:
convert img1.png[701x1+0+3] +repage img---row3.txt
If you are not familiar with the 'txt' format, here is an extract:
cat img---row3.txt
# ImageMagick pixel enumeration: 701,1,255,gray
0,0: (255,255,255) #FFFFFF gray(255)
1,0: (255,255,255) #FFFFFF gray(255)
2,0: (255,255,255) #FFFFFF gray(255)
3,0: (255,255,255) #FFFFFF gray(255)
4,0: (255,255,255) #FFFFFF gray(255)
5,0: (255,255,255) #FFFFFF gray(255)
6,0: (255,255,255) #FFFFFF gray(255)
7,0: (255,255,255) #FFFFFF gray(255)
8,0: (255,255,255) #FFFFFF gray(255)
9,0: (255,255,255) #FFFFFF gray(255)
[...skipping many lines...]
695,0: (255,255,255) #FFFFFF gray(255)
696,0: (255,255,255) #FFFFFF gray(255)
697,0: (255,255,255) #FFFFFF gray(255)
698,0: (255,255,255) #FFFFFF gray(255)
699,0: (255,255,255) #FFFFFF gray(255)
700,0: (255,255,255) #FFFFFF gray(255)
The 'txt' output file describes every pixel via a text line.
In each line the first column indicates the respective pixel's coordinates.
The second, third and fourth columns indicate the pixel's color in different ways (but they contain the same information each).
7. Convert each row into its 'txt' format and create its MD5 sum
This command also creates 'txt' output. But this time the 'target' file is given as txt:-. This means that the output is streamed to <stdout>.
for i in {0..973}; do \
convert img1.png[701x1+0+${i}] txt:- \
| md5sum > md5sum--img1--row${i}.md5 ; \
done
This command creates 974 different files containing the MD5 sum of the 'txt' representation for the respective rows.
We can also write all MD5 sums into a single file:
for i in {0..973}; do \
convert img1.png[701x1+0+${i}] txt:- \
| md5sum >> md5sum--img1--all-rows.md5 ; \
done
Now do the same thing for img2.png:
for i in {0..722}; do \
convert img2.png[701x1+0+${i}] txt:- \
| md5sum >> md5sum--img2--all-rows.md5 ; \
done
8. Use sdiff to determine which lines of the .md5 files match
We can use sdiff to compare the two .md5 files line by line and write the output to a log file. The nl -v 0 part of the following command automatically inserts the line number, starting with 0 into the result:
sdiff md5sum--img{1,2}--all-rows.md5 | nl -v 0 > md5sums.log
9. Check the md5sums.log for identical lines
cat md5sums.log
0 > 38c6cd70c39ffc853d1195a0da6474f8 -
1 > 85100351b390ace5a7caca11776666d5 -
2 > 66e2940dbb390e635eeba9a2944960dc -
3 > 8e93c1ed5c89aead8333f569cb768e4a -
4 > 8e93c1ed5c89aead8333f569cb768e4a -
[... skip many lines ...]
172 > f9fece874b60fa1af24516c4bcee7302 -
173 > edbe62592a3de60d18971dece07e3beb -
174 > 18a28776cc64ead860a99213644b0574 -
175 0d0753c587dc3c46078ac265895a3f6c - | 0d0753c587dc3c46078ac265895a3f6c -
176 5ecc2b5a61af4120151fed4cd2c3d305 - | 5ecc2b5a61af4120151fed4cd2c3d305 -
177 3f2857594fe410dc7fe42b4bef724a87 - | 3f2857594fe410dc7fe42b4bef724a87 -
178 2fade815d804b6af96550860602ec1ba - | 2fade815d804b6af96550860602ec1ba -
[... skip many lines ...]
719 127e6d52095db20f0bcb1fe6ff843da0 - | 127e6d52095db20f0bcb1fe6ff843da0 -
720 aef15dde4909e9c467f11a64198ba6d2 - | aef15dde4909e9c467f11a64198ba6d2 -
721 6320863dd7d747356f4b23fb7ba28a73 - | 6320863dd7d747356f4b23fb7ba28a73 -
722 2e32ceb7cc89d7bb038805e484dc7bc9 - | 2e32ceb7cc89d7bb038805e484dc7bc9 -
723 f9fece874b60fa1af24516c4bcee7302 - <
724 f9fece874b60fa1af24516c4bcee7302 - <
725 f9fece874b60fa1af24516c4bcee7302 - <
726 f9fece874b60fa1af24516c4bcee7302 - <
[... skip many lines ...]
1146 3e18a7db0aed8b6ac6a3467c6887b733 - <
1147 62866c8ef78cdcd88128b699794d93e6 - <
1148 7dbed48a0e083d03a6d731a6864d1172 - <
From this output we can conclude that rows 175 -- 722 in the sdiff-produced file all do match.
This means that there is a match in the following rows of the original images:
row 0 of img1.png matches row 175 of img2.png (begin of match).
img1.png has a total of 974 rows of pixels.
row 547 of img1.png matches row 722 of img2.png (end of match).
img2.png has a total of 723 rows of pixels.
(Remember, we used 0-based row numbering...)
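Instead of reading the sdiff log by eye, the start of the overlap could also be computed. The following is a sketch under a few assumptions: overlap_start is a hypothetical helper, its two arguments are files with one row hash per line (upper image first, lower image second), and the overlap is taken to be the point from which the rest of the upper image matches the beginning of the lower image row for row.

```shell
#!/bin/sh
# overlap_start (hypothetical name): given two files of per-row hashes,
# print the smallest row index k of the upper image from which all
# remaining upper rows equal the first rows of the lower image.
# Prints -1 if there is no such overlap.
overlap_start() {
    awk '
        NR == FNR { top[n++] = $1; next }    # first file: rows of the upper image
                  { bot[m++] = $1 }          # second file: rows of the lower image
        END {
            for (k = 0; k < n; k++) {
                ok = 1
                for (i = k; i < n && ok; i++)
                    if (i - k >= m || top[i] != bot[i - k]) ok = 0
                if (ok) { print k; exit }
            }
            print -1
        }' "$1" "$2"
}
```

The printed value is the number of rows to keep from the upper image before appending the whole lower image. The comparison is quadratic in the worst case, which should be fine for images of around a thousand rows.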
10. Put it all together now
From the above investigations we can conclude that img2.png is the upper part: we need only its first 175 rows (rows 0 to 174, the ones without a match) and append the full img1.png below them in order to get the correct result:
convert img2.png[701x175+0+0] img1.png -append complete.png
NOTES:
There are many possible solutions (and methods to arrive there) to the problem posed by the OP. For example:
Instead of converting the rows to 'txt' format we could have used any other ImageMagick-supported format also (PNG, PPM, ...) and created the MD5 sums for comparison.
Instead of using -append to concatenate the two image parts, we could also have used -composite to superimpose them (with an appropriate offset, of course).
As @MarkSetchell says in his comment: instead of piping the 'pixel-rows' output to md5sum one could also use -format '%#' info:- to directly generate a hash value from the respective pixel-row. I had already forgotten about that option, because (years ago) I tried to use it for a similar purpose, and somehow it didn't work as I needed it. Which is why I became used to my 'piping to md5sum' approach...

Laser line detection opencv

I want to detect a laser line for an autonomous system.
My work till now:
1. I split the image in rgb channels
2. use only the red channel because of using a red laser line
3. get threshold value manually
4. search the binary image for a value != 0
I can't set the threshold manually in an autonomous system. Any ideas how to solve the problem?
And only searching of the highest peak in an image isn't good enough because of incidence of sunlight.
Maybe I can search for short peaks..
Because in the region of the laser line the brightness increase fast and then decrease fast after the laser line.
How can I realize that in opencv?
Updated
Ok, I have had a look at your updated picture. My algorithm comes down to the following steps.
Find Brightest Column (i.e. laser line) in Image
Find Dark Gap in Brightest Column
Find Neighbouring Column That is Brightest in Gap in laser line
Step 1 - Find Brightest Column (i.e. laser line) in Image
The easiest way to do this is to squidge the image down so it is still its original width, but just one pixel high effectively averaging the pixels in each vertical column of the image. Then apply an -auto-level to contrast stretch that to the full range of 0-255 and threshold it at 95% to find all columns that are within 5% of the brightest. Then look for pixels that have thresholded out to white (#ffffff). This is one line in ImageMagick, as follows:
convert http://i.stack.imgur.com/1P1zj.jpg -colorspace gray \
-resize x1! \
-auto-level \
-threshold 95% txt: | grep -i ffffff
Output:
297,0: (255,255,255) #FFFFFF white
298,0: (255,255,255) #FFFFFF white
299,0: (255,255,255) #FFFFFF white
So, I now know that columns 297-299 are the ones where the laser line is. Note that if the picture is slightly rotated, or the laser is not vertical, the bright column will be split across multiple columns. To counteract this, you could shrink the width of the image by a factor of two or three so that adjacent columns tend to get merged into one in the smaller image, then just multiply up the column by the shrink factor to find the original position.
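If the thresholded output needs to be turned into column ranges automatically, something along these lines might do. It is a sketch: white_column_runs is an invented name, and it assumes the input is the one-row txt: output from the command above, filtered or not.

```shell
#!/bin/sh
# white_column_runs (hypothetical name): read txt: output of a one-row
# image on stdin and print each contiguous run of white (#FFFFFF)
# columns as "first-last".
white_column_runs() {
    awk -F'[,:]' '
        /^#/ { next }                                    # skip the header comment
        /#FFFFFF/ {
            x = $1 + 0
            if (!inrun)             { start = x; inrun = 1 }
            else if (x != prev + 1) { print start "-" prev; start = x }
            prev = x
        }
        END { if (inrun) print start "-" prev }'
}
```

Appending `| white_column_runs` to the convert pipeline above would print one line per bright band, which makes it easier to notice when sunlight produces a second band besides the laser line.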
That completes Step 1, but an alternative method follows before Step 2.
I split the image into columns 1 pixel wide with:
convert input.png -crop 1x +repage line%d.png
Now I find the brightest column (one with highest mean brightness) with:
for f in line*; do m=$(convert -format "%[fx:mean]" $f info:);echo $m:$f ;done | sort -g
which gives this
...
...
0.559298:line180.png
0.561051:line185.png
0.561337:line306.png
0.562527:line184.png
0.562939:line183.png
0.584523:line295.png
0.590632:line299.png
0.644543:line296.png
0.671116:line298.png
0.71122:line297.png <--- brightest column = 297
Step 2 - Find Dark Gap in Brightest Column
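Now I take column 297 and auto-level it so the darkest part becomes black and the lightest part becomes white, then I threshold it at 20% and negate it, so that the dark gap in the laser line shows up as white. Step 3 assumes this result has also been saved as mask.png.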
convert line297.png -colorspace gray -auto-level -threshold 20% -negate txt:
...
0,100: (0,0,0) #000000 black
0,101: (0,0,0) #000000 black
0,102: (0,0,0) #000000 black
0,103: (0,0,0) #000000 black
0,104: (0,0,0) #000000 black
0,105: (0,0,0) #000000 black
0,106: (0,0,0) #000000 black
0,107: (0,0,0) #000000 black
0,108: (255,255,255) #FFFFFF white <- gap in laser line
0,109: (255,255,255) #FFFFFF white <- gap in laser line
0,110: (255,255,255) #FFFFFF white <- gap in laser line
0,111: (255,255,255) #FFFFFF white <- gap in laser line
0,112: (0,0,0) #000000 black
0,113: (0,0,0) #000000 black
...
0,478: (0,0,0) #000000 black
0,479: (0,0,0) #000000 black
Step 3 - Find Neighbouring Column That is Brightest in Gap in laser line
Now if I multiply this column with each of the columns either side of it, all parts of the other columns that are not in the gap in the laser line will become zero and all parts that are in the gap in the laser line will be multiplied and totalled up as I run through the columns either side of column 297.
So, I check columns 240 to 340, multiplying each column with the mask from the previous step and seeing which one is brightest in the gap in the laser line:
for i in {240..340} ;do n=$(convert line${i}.png mask.png -compose multiply -composite -format "%[mean]" info:);echo $n:$i ;done | sort -g
The output is as follows:
458.495:248
466.169:249
468.668:247
498.294:260
502.756:250
536.844:259
557.726:258
564.508:251
624.117:252
627.508:253 <--- column 253 is brightest
Then I can see that column 253 is the brightest in the area where the laser line is darkest. So the displaced line is in column 253.
I am sure this technique could be done fairly easily in opencv.
Original Answer
I can tell you a way to do it, but not give you any code for opencv as I tend to use ImageMagick. I split the image into a series of vertical images, each 1 pixel wide - i.e. single pixel columns. Then I get the average of the brightnesses in all columns and can immediately see the brightest column. It works pretty well, here is how I tested the algorithm:
Split image into single pixel columns
convert http://i.stack.imgur.com/vMiU1.jpg -crop 1x +repage line%04d.png
See what we got:
ls line*
line0000.png line0128.png line0256.png line0384.png line0512.png
line0001.png line0129.png line0257.png line0385.png line0513.png
...
line0126.png line0254.png line0382.png line0510.png line0638.png
line0127.png line0255.png line0383.png line0511.png line0639.png
Yes, 640 vertical lines. Check size of one...
identify line0639.png
line0639.png PNG 1x480 1x480+0+0 8-bit sRGB 1.33KB 0.000u 0:00.000
Yes, it's 1 pixel wide and 480 pixels high.
Now get mean brightness of all lines and sort by brightness:
for f in line*; do m=$(convert -format "%[fx:mean]" $f info:);echo $m:$f ;done | sort -g
Output
0.5151:line0103.png
0.521621:line0104.png
0.527829:line0360.png
0.54699:line0356.png
0.567822:line0355.png
0.752827:line0358.png <--- highest brightness
0.76616:line0357.png <--- highest brightness
Columns 357 and 358 seem to be readily identifiable as your answer.

imagemagick detect coordinates of transparent areas

I have a PNG image with transparent areas, squares/rectangle areas that contains transparency.
I would like to know if there is some way to find the top, left, width and height of these transparent areas in the image.
Thanks for any help
Updated Answer
In the intervening years, I have come across a simpler solution, so I thought I would update it so anyone else seeing it can get the benefit of the latest and greatest.
Start off just the same, by extracting the alpha layer to an image of its own and invert it:
convert start.png -alpha extract -negate intermediate.png
Now perform a "Connected Component Analysis" on that:
convert start.png -alpha extract -negate \
-define connected-components:verbose=true \
-define connected-components:area-threshold=100 \
-connected-components 8 -auto-level result.png
Objects (id: bounding-box centroid area mean-color):
0: 256x256+0+0 128.7,130.4 62740 srgb(0,0,0)
3: 146x8+103+65 175.5,68.5 1168 srgb(255,255,255)
2: 9x93+29+42 33.0,88.0 837 srgb(255,255,255)
1: 113x7+4+21 60.0,24.0 791 srgb(255,255,255)
You will see there is a header line and 4 lines of output, and each has a colour at the end. The first line is black and corresponds to the entire shape, and the last three are white, corresponding to the three transparent areas. It is basically the second field on each of the last three lines that you want. So, 146x8+103+65 means a box 146px wide by 8px tall, offset 103px to the right of the top-left corner and 65px down from the top-left corner.
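To turn such a WxH+X+Y field into the two corner points that -draw "rectangle ..." expects, a small helper like the following could be used. It is a sketch: geometry_to_rect is a made-up name, and it assumes non-negative offsets.

```shell
#!/bin/sh
# geometry_to_rect (hypothetical name): convert "WxH+X+Y" into
# "X,Y X+W,Y+H", i.e. the top-left and bottom-right corners.
geometry_to_rect() {
    echo "$1" | awk -F'[x+]' '{ print $3 "," $4, ($3 + $1) "," ($4 + $2) }'
}

geometry_to_rect 146x8+103+65    # prints: 103,65 249,73
```

The bottom-right corner is simply the offset plus the width and height.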
If I draw those in, in red, you can see what it has identified:
convert result.png -stroke red -fill none -strokewidth 1 \
-draw "rectangle 103,65 249,73" \
-draw "rectangle 29,42 38,135" \
-draw "rectangle 4,21 117,28" result.png
Original Answer
The following may help you get to an answer but I have not developed it all the way through to completion - people often ask questions and then never log in again and there is quite a lot of effort involved...
Let's start with this input image - where the white areas are transparent:
You can extract the alpha channel from an image with ImageMagick like this:
convert input.png -alpha extract -negate alpha.png
which gives this, where white areas are transparent
Ok, one approach is to find the bounding box of the white areas, you can do this with trim and it will give you the bounding box that encloses the white areas:
convert input.png -alpha extract -format "%@" info:
245x114+4+21
So the bounding box is 245px wide and 114px high starting at offset +4+21 from top left. I can draw that on the image to show it:
So, that is a start.
Also, you can get ImageMagick to enumerate the pixels in text format, so you can run this command
convert input.png -alpha extract -negate txt: | more
# ImageMagick pixel enumeration: 256,256,255,gray
0,0: (0,0,0) #000000 gray(0)
1,0: (0,0,0) #000000 gray(0)
2,0: (0,0,0) #000000 gray(0)
which tells you that the image is 256x256 and that the first 3 pixels are all black. If you want the white ones (i.e. transparent ones) you can do this:
convert input.png -alpha extract -negate txt: | grep FFFFFF | more
4,21: (255,255,255) #FFFFFF gray(255)
5,21: (255,255,255) #FFFFFF gray(255)
6,21: (255,255,255) #FFFFFF gray(255)
7,21: (255,255,255) #FFFFFF gray(255)
This tells you that pixel 4,21 is the top left corner of your transparent area - I'm glad it matches the output from the bounding box method above :-)
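The same bounding box can be derived from the pixel list itself, which is a handy cross-check. A minimal sketch, assuming the negated alpha image from above so that transparent pixels show up as #FFFFFF (white_bounding_box is a hypothetical name):

```shell
#!/bin/sh
# white_bounding_box (hypothetical name): read txt: output on stdin and
# print the bounding box of all #FFFFFF pixels as "WxH+X+Y".
# Prints nothing if there are no white pixels.
white_bounding_box() {
    awk -F'[,:]' '
        /^#/ { next }                                    # skip the header comment
        /#FFFFFF/ {
            x = $1 + 0; y = $2 + 0
            if (!seen) { minx = maxx = x; miny = maxy = y; seen = 1 }
            if (x < minx) minx = x
            if (x > maxx) maxx = x
            if (y < miny) miny = y
            if (y > maxy) maxy = y
        }
        END {
            if (seen)
                printf "%dx%d+%d+%d\n", maxx - minx + 1, maxy - miny + 1, minx, miny
        }'
}
```

It would be run as `convert input.png -alpha extract -negate txt: | white_bounding_box`. Note that this gives one box around all transparent pixels together, not one box per area.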
So, you can easily get a list of all the pixels that are transparent. This approach could be developed, or something similar coded up in Ruby (RMagick) to find contiguous areas of black - but that is beyond the scope of this answer for the moment - as I am not a Ruby programmer :-)
Ok, I have learned some Ruby this afternoon and, no laughter please, this is my first Ruby program. It is probably pretty ugly and more like Perl or C (my preferred languages) but it works and finds the rectangular transparent areas.
#!/usr/bin/ruby
require 'RMagick'
include Magick
infile=ARGV[0]
img = ImageList.new(infile)
w=img.columns
h=img.rows
#Extract alpha channel into pixel array
px=img.export_pixels(0,0,w,h,"A")
for row in 0..h-1
  for col in 0..w-1
    thispx=px[w*row+col]
    if thispx<32768 then
      a=row
      b=col
      # Find extent (c) of rectangle towards right
      for r in col..w-1
        thispx=px[w*row+r]
        if thispx<32768
          c=r
        else
          break
        end
      end
      # Find extent (d) of rectangle towards bottom
      for s in row..h-1
        thispx=px[w*s+col]
        if thispx<32768
          d=s
        else
          break
        end
      end
      # Blank this rectangle as we have located it
      for r in row..d
        for s in col..c
          px[w*r+s]=65535
        end
      end
      # Tell caller about this rectangle
      printf "%d,%d %d,%d\n",a,b,d,c
    end
  end
end
Run it like this:
bounds.rb input.png
