Let's say I have this input image, with any number of boxes. I want to segment these boxes so that I can eventually extract them.
input image:
The background could be anything continuous, like a painted wall, a wooden table, or carpet.
My idea was that the gradient would be roughly constant throughout the background, so I could set the pixels where the gradient is about the same to zero in the image.
Through edge detection, I would dilate and fill the regions where edges are detected. Essentially, my goal is to make a blob of each area where a box is. With the blobs, I would know the exact location of the boxes and could crop them out of the input image.
So in this case, I should be able to have four blobs, and then I would be able to crop out four images from the input image.
This is how far I got:
segmented image:
query = imread('AllFour.jpg');
gray = rgb2gray(query);
[~, threshold] = edge(gray, 'sobel');
weightedFactor = 1.5;
BWs = edge(gray,'roberts');
%figure, imshow(BWs), title('binary gradient mask');
se90 = strel('disk', 30);
se0 = strel('square', 3);
BWsdil = imdilate(BWs, [se90]);
%figure, imshow(BWsdil), title('dilated gradient mask');
BWdfill = imfill(BWsdil, 'holes');
figure, imshow(BWdfill);
title('binary image with filled holes');
What a very interesting problem! Here's my solution in an attempt to solve this problem for you. This is assuming that the background has the same colour distribution throughout. First, transform your image from RGB to the HSV colour space with rgb2hsv. The HSV colour space is an ideal transform for analyzing colours. After this, I would look at the saturation and value planes. Saturation is concerned with how "pure" the colour is, while value is the intensity or brightness of the colour itself. If you take a look at the saturation and value planes for the image, this is what is shown:
im = imread('http://i.stack.imgur.com/1SGVm.jpg');
out = rgb2hsv(im);
figure;
subplot(2,1,1);
imshow(out(:,:,2));
subplot(2,1,2);
imshow(out(:,:,3));
This is what I get:
Looking at a few locations in the gray background, most of the saturation values are below 0.2 and most of the elements in the value plane are above 0.3. We want the opposite of those pixels to get our objects, so we keep the pixels whose saturation is greater than 0.2 or whose value is less than 0.3:
seg = out(:,:,2) > 0.2 | out(:,:,3) < 0.3;
This is what we get:
Almost there! There are some spurious single pixels, so I'm going to perform an opening with imopen with a line structuring element.
After this, I'll perform a dilation with imdilate to close any gaps, then use imfill with the 'holes' option to fill in the gaps, then use erosion with imerode to shrink the shapes back to their original form. As such:
se = strel('line', 3, 90);
pre = imopen(seg, se);
se = strel('square', 20);
pre2 = imdilate(pre, se);
pre3 = imfill(pre2, 'holes');
final = imerode(pre3, se);
figure;
imshow(final);
final contains the segmented image with the 4 candy boxes. This is what I get:
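For readers working outside MATLAB, the thresholding step and the final crop can be sketched in Python with NumPy. This is a minimal sketch, not the code used above: the helper names (saturation_value, segment_foreground, bounding_box) and the synthetic test image are assumptions made for illustration, and it handles a single blob only. With several boxes you would label the mask first (for example with bwlabel, as used elsewhere in this thread) and apply the same bounding-box step per label.

```python
import numpy as np

def saturation_value(rgb):
    """Return HSV-style saturation and value planes (in [0, 1]) for a uint8 RGB image."""
    rgb = rgb.astype(np.float64) / 255.0
    v = rgb.max(axis=2)
    chroma = v - rgb.min(axis=2)
    # Saturation is chroma / value, defined as 0 where value is 0 (pure black).
    s = np.divide(chroma, v, out=np.zeros_like(v), where=v > 0)
    return s, v

def segment_foreground(rgb, sat_thresh=0.2, val_thresh=0.3):
    """Mark pixels that are either strongly coloured or dark."""
    s, v = saturation_value(rgb)
    return (s > sat_thresh) | (v < val_thresh)

def bounding_box(mask):
    """Return (top, bottom, left, right) of the True pixels, end indices exclusive."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1

# Synthetic example (hypothetical data): light grey background, one red "box".
img = np.full((60, 80, 3), 200, dtype=np.uint8)
img[10:30, 20:50] = (220, 30, 30)

mask = segment_foreground(img)
top, bottom, left, right = bounding_box(mask)
crop = img[top:bottom, left:right]
```

The 0.2 and 0.3 thresholds are the ones read off the image above; they would need retuning for a different background.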
Try resizing the image. When you make it smaller, it becomes easier to join edges. I tried what's shown below. You might have to tune it depending on the nature of the background.
close all;
clear all;
im = imread('1SGVm.jpg');
small = imresize(im, .25); % resize
grad = (double(imdilate(small, ones(3))) - double(small)); % extract edges
gradSum = sum(grad, 3);
bw = edge(gradSum, 'Canny');
joined = imdilate(bw, ones(3)); % join edges
filled = imfill(joined, 'holes');
filled = imerode(filled, ones(3));
imshow(label2rgb(bwlabel(filled))) % label the regions and show
If you have a recent version of MATLAB, try the Color Thresholder app in the image processing toolbox. It lets you interactively play with different color spaces, to see which one can give you the best segmentation.
If your candy covers are fixed, or you know all the covers that can appear in the scene, then template matching works best for this, since it is independent of the background in the image.
http://docs.opencv.org/doc/tutorials/imgproc/histograms/template_matching/template_matching.html
I'm trying to remove the grid lines in handwriting picture. I tried to use FFT to extract the grid pattern and remove it (this is from an answer in the original question, which is closed somehow. It has more background as well.). This image shows what I am able to get currently (Illustration result):
The first line is a real image with a handwritten character. Since it's taken by phone under various conditions (light, direction, etc.), the grid lines might not be perfectly horizontal/vertical, and the color of the grid lines also varies and might be close to the color of the characters. I convert it to grayscale, apply the FFT, and try to use thresholding to extract the patterns (in the red rectangle; the illustration uses Otsu). Then I mask the spectrum with the thresholded pattern and use the inverse FFT to get the result. It obviously fails on the real image.
The second line is a real image of a blank grid without a handwritten character. From this, I think the 3 lines (vertical and horizontal) in the center are the patterns I care about.
The third line is a synthetic image with perfect grid lines. It's just for reference. After applying the same algorithm, the grid lines are removed successfully.
The fourth line is a synthetic image with perfect dashed grid lines, which is closer to the grid lines on real handwriting practice paper. It's also for reference. It shows that the pattern of dashed lines is actually more complicated than 3 lines in the center. With the same algorithm, the grid lines are removed almost completely as well.
The code I use is:
def FFTCV(img):
    util.Plot(img, 'Input')
    print(img.shape)
    if len(img.shape) == 3 and img.shape[2] == 3:
        img = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
        util.Plot(img, 'Gray')
    dft = cv.dft(np.float32(img),flags = cv.DFT_COMPLEX_OUTPUT)
    dft_shift = np.fft.fftshift(dft)
    util.Plot(cv.magnitude(dft_shift[:,:,0],dft_shift[:,:,1]), 'fft shift')
    magnitude_spectrum = np.uint8(20*np.log(cv.magnitude(dft_shift[:,:,0],dft_shift[:,:,1])))
    util.Plot(magnitude_spectrum, 'Magnitude')
    _, threshold = cv.threshold(magnitude_spectrum, 0, 1, cv.THRESH_BINARY_INV + cv.THRESH_OTSU)
    # threshold = cv.adaptiveThreshold(
    # magnitude_spectrum, 1, cv.ADAPTIVE_THRESH_MEAN_C, cv.THRESH_BINARY_INV, 11, 10)
    # magnitude_spectrum, 1, cv.ADAPTIVE_THRESH_GAUSSIAN_C, cv.THRESH_BINARY_INV, 11, 10)
    util.Plot(threshold, 'Threshold Mask')
    fshift = dft_shift * threshold[:, :, None]
    util.Plot(cv.magnitude(fshift[:,:,0],fshift[:,:,1]), 'fft shift Masked')
    magnitude_spectrum = np.uint8(20*np.log(cv.magnitude(fshift[:,:,0],fshift[:,:,1])))
    util.Plot(magnitude_spectrum, 'Magnitude Masked')
    f_ishift = np.fft.ifftshift(fshift)
    img_back = cv.idft(f_ishift)
    img_back = cv.magnitude(img_back[:,:,0],img_back[:,:,1])
    util.Plot(img_back, 'Back')
So I'd like suggestions on how to extract the patterns from real images. Thanks very much.
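For the synthetic reference case only, the idea of masking the central lines of the shifted spectrum can be sketched with NumPy alone. This is a rough illustration under strong assumptions: the function name remove_axis_aligned_pattern is hypothetical, the grid is perfectly axis-aligned and purely additive, and the whole central cross is zeroed instead of being selected by a threshold. It is not expected to work on the tilted, unevenly lit real photos described above.

```python
import numpy as np

def remove_axis_aligned_pattern(img):
    """Zero the central cross of the shifted spectrum but keep the DC term."""
    f = np.fft.fftshift(np.fft.fft2(img))
    cr, cc = img.shape[0] // 2, img.shape[1] // 2
    dc = f[cr, cc]
    # Zero vertical frequency: patterns that change only from column to
    # column, i.e. vertical lines.
    f[cr, :] = 0
    # Zero horizontal frequency: patterns that change only from row to
    # row, i.e. horizontal lines.
    f[:, cc] = 0
    # Restore the DC term so the mean brightness is preserved.
    f[cr, cc] = dc
    return np.real(np.fft.ifft2(np.fft.ifftshift(f)))

# Synthetic grid (hypothetical data): a darker line every 8 pixels in both directions.
size = 64
line = (np.arange(size) % 8 == 0).astype(np.float64)
grid = 1.0 - 0.3 * line[:, None] - 0.3 * line[None, :]

cleaned = remove_axis_aligned_pattern(grid)
```

Note that this also removes the row and column averages of any handwriting in the image, so on real data a narrower, peak-based mask is likely to be needed.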
I am trying to crop a specific part of a frame in opencv to get a cropped image of the detections from mobilenet ssd model. The code to crop the image is like this
for box_id in boxes_ids:
    x,y,w,h,id = box_id
    crop=frame[y:h,x:w]
    cv2.imshow("d",crop)
    cv2.waitKey(5)
This code produces a blank space on the right of all the images that I extract:
Please tell me how I can fix this.
Try using Pillow; that helps:
from PIL import Image, ImageChops

def trim(im, color):
    # Solid image of the border colour, same mode and size as im
    bg = Image.new(im.mode, im.size, color)
    # Non-zero wherever im differs from the border colour
    diff = ImageChops.difference(im, bg)
    diff = ImageChops.add(diff, diff)
    bbox = diff.getbbox()
    if bbox:
        return im.crop(bbox)
This function will probably remove it; just be careful, because it only works if the blank segment of the image has a consistent pixel colour.
As said before in the comments, there is a minimum window width, and smaller crops will be drawn on some neutral background.
But it may be more intuitive to draw the crop into an empty image, preserving its original position:
for box_id in boxes_ids:
    x,y,w,h,id = box_id
    draw = np.zeros(frame.shape, np.uint8)
    draw[y:h,x:w] = frame[y:h,x:w]
    cv2.imshow("d",draw)
    cv2.waitKey(5)
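It may also be worth checking which box convention the tracker returns, since the slice frame[y:h,x:w] is only correct if the last two numbers are corner coordinates and not a width and height. The two cases can be sketched as follows; the helper names crop_xywh and crop_xyxy and the example box are assumptions made for illustration.

```python
import numpy as np

def crop_xywh(frame, box):
    """Crop when the box is (x, y, width, height)."""
    x, y, w, h = box
    return frame[y:y + h, x:x + w]

def crop_xyxy(frame, box):
    """Crop when the box is (x1, y1, x2, y2), i.e. two corner points."""
    x1, y1, x2, y2 = box
    return frame[y1:y2, x1:x2]

# Hypothetical frame and box, just to show how the two readings differ.
frame = np.zeros((100, 200, 3), dtype=np.uint8)
box = (50, 20, 60, 40)

as_xywh = crop_xywh(frame, box)  # rows 20..59, columns 50..109
as_xyxy = crop_xyxy(frame, box)  # rows 20..39, columns 50..59
```

Printing crop.shape inside the loop is a quick way to tell whether the crop itself is wrong or whether only the display window is padding it.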
I am using an object detection machine learning model (only 1 object). It works well when there are only a few objects in the image. But if my image has more than 300 objects, it can't recognize anything. So I want to divide the image into two or four parts without cutting through any object.
I applied Otsu thresholding and got this thresholded image. What I actually want is to divide my image along the line shown in the expected image. I think my model will work well if it makes predictions on each part of the image.
I tried using findContours to find a contour whose area is bigger than half the image area, draw it into a new image, then take the remaining part and draw it into another image. But most contour areas don't even reach 1/10 of the image area, so it is not a good solution.
I thought about detecting a line that touches two boundaries (top and bottom). How can I do that?
Any suggestion is appreciated. Thanks so much.
Since your regions of interest are separated already, you can use connectedComponents to get the bounding boxes of these regions. My approach is below.
import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('circles.png',0)
img = img[20:,20:] # remove the connecting lines on the top and the left sides
_, img = cv2.threshold(img,0,1,cv2.THRESH_BINARY)
labels,stats= cv2.connectedComponentsWithStats(img,connectivity=8)[1:3]
plt.imshow(labels,'tab10')
plt.show()
As you can see, the two regions of interest have different labels. All we need to do is get the bounding boxes of these regions. But first, we have to get the indices of the regions. For this, we can use the size of the areas, because after the background (blue), they have the largest areas.
areas = stats[1:,cv2.CC_STAT_AREA] # the first index is always for the background, we do not need that, so remove the background index
roi_indices = np.flip(np.argsort(areas))[0:2] # this will give you the indices of two largest labels in the image, which are your ROIs
# Coordinates of bounding boxes
left = stats[1:,cv2.CC_STAT_LEFT]
top = stats[1:,cv2.CC_STAT_TOP]
width = stats[1:,cv2.CC_STAT_WIDTH]
height = stats[1:,cv2.CC_STAT_HEIGHT]
for i in range(2):
    roi_ind = roi_indices[i]
    roi = labels==roi_ind+1
    roi_top = top[roi_ind]
    roi_bottom = roi_top+height[roi_ind]
    roi_left = left[roi_ind]
    roi_right = roi_left+width[roi_ind]
    roi = roi[roi_top:roi_bottom,roi_left:roi_right]
    plt.imshow(roi,'gray')
    plt.show()
For your information, my method is only valid for 2 regions. In order to split into 4 regions, you would need some other approach.
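If the goal is only a straight cut that touches the top and bottom borders without crossing any object, one simple alternative is to look for columns of the binary mask that contain no foreground at all. The sketch below assumes such an empty column exists and that objects are True in the mask; the function names find_split_columns and best_vertical_split and the test mask are hypothetical.

```python
import numpy as np

def find_split_columns(mask):
    """Return the indices of columns that contain no foreground pixel."""
    return np.flatnonzero(~mask.any(axis=0))

def best_vertical_split(mask):
    """Pick the empty column closest to the image centre, or None if there is none."""
    empty = find_split_columns(mask)
    if empty.size == 0:
        return None
    centre = mask.shape[1] // 2
    return int(empty[np.argmin(np.abs(empty - centre))])

# Hypothetical mask: two groups of objects separated by an empty band.
mask = np.zeros((50, 100), dtype=bool)
mask[5:45, 10:40] = True
mask[5:45, 60:95] = True

split = best_vertical_split(mask)
left_part, right_part = mask[:, :split], mask[:, split:]
```

Applying the same idea to rows (mask.any(axis=1)) on each half would give four parts, again only if straight empty bands exist.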
I am working on some leaf images using OpenCV (Java). The leaves are captured on white paper, and some have shadows like this one:
Of course, this is a rather extreme case (other images have milder shadows).
Now, I want to threshold the leaf and also remove the shadow (while preserving the leaf's details).
My current flow is this:
1) Converting to HSV and extracting the Saturation channel:
Imgproc.cvtColor(colorMat, colorMat, Imgproc.COLOR_RGB2HSV);
ArrayList<Mat> channels = new ArrayList<Mat>();
Core.split(colorMat, channels);
satImg = channels.get(1);
2) De-noising (median) and applying adaptiveThreshold:
Imgproc.medianBlur(satImg , satImg , 11);
Imgproc.adaptiveThreshold(satImg , satImg , 255, Imgproc.ADAPTIVE_THRESH_MEAN_C, Imgproc.THRESH_BINARY, 401, -10);
And the result is this:
It looks OK, but the shadow is causing some anomalies along the left boundary. Also, I have this feeling that I am not using the white background to my benefit.
Now, I have 2 questions:
1) How can I improve the result and get rid of the shadow?
2) Can I get good results without working on the saturation channel? The reason I ask is that on most of my images, working on the L channel (from HLS) gives much better results (apart from the shadow, of course).
Update: Using the Hue channel makes thresholding better, but makes the shadow situation worse:
Update2: In some cases, the assumption that the shadow is darker than the leaf doesn't always hold. So, working on intensities won't help. I'm looking more toward a color channels approach.
I don't use OpenCV; instead, I used the MATLAB Image Processing Toolbox to extract the leaf. Hopefully OpenCV has all the equivalent processing functions. Please see my result below. I did all the operations on channel 3 and channel 1 of your original image.
First I took channel 3 and thresholded it at 100 (top left). Then I removed the regions touching the border and the regions smaller than 100 pixels, and filled the holes in the leaf; the result is shown at the top right.
Next I took channel 1 and did the same thing as for channel 3; the result is shown at the bottom left. Then I found the connected regions (there are only two, as you can see in the bottom-left figure) and removed the one with the smaller area (shown at the bottom right).
Suppose the top-right image is I1 and the bottom-right image is I; the leaf is extracted by computing ~I & I1 (element-wise). The leaf is:
Hope it helps. Thanks
I tried two different things:
1. other thresholding on the saturation channel
2. try to find two contours: shadow and leaf
I use C++, so your code snippets will look a little different.
Trying Otsu thresholding instead of adaptive thresholding:
cv::threshold(hsv_imgs,mask,0,255,CV_THRESH_BINARY|CV_THRESH_OTSU);
This leads to the following images (just Otsu thresholding on the saturation channel):
The other thing is computing gradient information (I used Sobel; see the OpenCV documentation), thresholding that, and, after an opening operator, using findContours. This gives something like the following, which is not usable yet (gradient contour approach):
I'm trying to do the same thing with photos of butterflies, but with more uneven and unpredictable backgrounds such as this. Once you've identified a good portion of the background (e.g. via thresholding, or as we do, flood filling from random points), what works well is to use the GrabCut algorithm to get all those bits you might miss on the initial pass. In python, assuming you still want to identify an initial area of background by thresholding on the saturation channel, try something like
import cv2
import numpy as np
img = cv2.imread("leaf.jpg")
sat = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)[:,:,1]
sat = cv2.medianBlur(sat, 11)
thresh = cv2.adaptiveThreshold(sat , 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 401, 10);
cv2.imwrite("thresh.jpg", thresh)
h, w = img.shape[:2]
bgdModel = np.zeros((1,65),np.float64)
fgdModel = np.zeros((1,65),np.float64)
grabcut_mask = (thresh // 255 * 3).astype(np.uint8) #background should be 0, probable foreground = 3; grabCut needs a uint8 mask
cv2.grabCut(img, grabcut_mask,(0,0,w,h),bgdModel,fgdModel,5,cv2.GC_INIT_WITH_MASK)
grabcut_mask = np.where((grabcut_mask ==2)|(grabcut_mask ==0),0,1).astype('uint8')
cv2.imwrite("GrabCut1.jpg", img*grabcut_mask[...,None])
This actually gets rid of the shadows for you in this case, because the edge of the shadow actually has high saturation levels, so is included in the grab cut deletion. (I would post images, but don't have enough reputation)
Usually, however, you can't trust shadows to be included in the background detection. In this case you probably want to compare areas in the image with the colour of the now-known background using the chromaticity distortion measure proposed by Horprasert et al. (1999) in "A Statistical Approach for Real-time Robust Background Subtraction and Shadow Detection". This measure takes account of the fact that for desaturated colours, hue is not a relevant measure.
Note that the pdf of the preprint you find online has a mistake (no + signs) in equation 6. You can use the version re-quoted in Rodriguez-Gomez et al (2012), equations 1 & 2. Or you can use my python code below:
def brightness_distortion(I, mu, sigma):
    return np.sum(I*mu/sigma**2, axis=-1) / np.sum((mu/sigma)**2, axis=-1)

def chromacity_distortion(I, mu, sigma):
    alpha = brightness_distortion(I, mu, sigma)[...,None]
    return np.sqrt(np.sum(((I - alpha * mu)/sigma)**2, axis=-1))
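As a quick sanity check of these two functions, the sketch below evaluates them on three hand-picked pixels. The background statistics and pixel values are made up for illustration: a shadow is modelled as the background colour at half brightness, and the leaf as a saturated green.

```python
import numpy as np

def brightness_distortion(I, mu, sigma):
    return np.sum(I * mu / sigma**2, axis=-1) / np.sum((mu / sigma)**2, axis=-1)

def chromacity_distortion(I, mu, sigma):
    alpha = brightness_distortion(I, mu, sigma)[..., None]
    return np.sqrt(np.sum(((I - alpha * mu) / sigma)**2, axis=-1))

# Hypothetical background statistics: light grey with a small spread.
mu = np.array([200.0, 200.0, 200.0])
sigma = np.array([5.0, 5.0, 5.0])

pixels = np.array([
    [200.0, 200.0, 200.0],  # background
    [100.0, 100.0, 100.0],  # shadow: background at half brightness
    [40.0, 160.0, 40.0],    # leaf: saturated green
])

alpha = brightness_distortion(pixels, mu, sigma)  # how much darker each pixel is
cd = chromacity_distortion(pixels, mu, sigma)     # how far off the background colour
```

The shadow pixel differs from the background only in alpha, while the leaf pixel is the only one with a large distortion value, which is what makes the measure useful for telling shadow from leaf.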
You can feed the RGB pixel image as the first parameter of the chromacity_distortion function and the known background mean & stdev as the last two parameters, which should show you that the shadow has basically the same chromaticity as the background, and a very different one from the leaf. In the code below, I've then thresholded on chromaticity and done another grabcut pass. This works to remove the shadow even if the first grabcut pass doesn't (e.g. if you originally thresholded on hue).
mean, stdev = cv2.meanStdDev(img, mask = 255-thresh)
mean = mean.ravel() #bizarrely, meanStdDev returns an array of size [3,1], not [3], so flatten it
stdev = stdev.ravel()
chrom = chromacity_distortion(img, mean, stdev)
chrom255 = cv2.normalize(chrom, None, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX).astype(np.uint8)[:,:,None] #dst is a required positional argument in the Python binding
cv2.imwrite("ChromacityDistortionFromBackground.jpg", chrom255)
thresh2 = cv2.adaptiveThreshold(chrom255 , 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 401, 10);
cv2.imwrite("thresh2.jpg", thresh2)
grabcut_mask[...] = 3
grabcut_mask[thresh==0] = 0 #where thresh == 0, definitely background, set to 0
grabcut_mask[np.logical_and(thresh == 255, thresh2 == 0)] = 2 #could try setting this to 2 or 0
cv2.grabCut(img, grabcut_mask,(0,0,w,h),bgdModel,fgdModel,5,cv2.GC_INIT_WITH_MASK)
grabcut_mask = np.where((grabcut_mask ==2)|(grabcut_mask ==0),0,1).astype('uint8')
cv2.imwrite("final_leaf.jpg", grabcut_mask[...,None]*img)
I'm afraid with the parameters I tried, this still removes the stalk, though. I think that's because GrabCut thinks that it looks a similar colour to the shadows. Let me know if you find a way to keep it.
I have a bunch of uncompressed bitonal TIF document images. All of them have a watermark in the middle. When I run them through OCR, the text that overlaps with the watermark does not get recognized. I am trying to see if I can apply some type of cleanup to remove those watermarks to be able to recognize the missing text.
Again, the images are black and white, but when you look at the watermark it appears grey since it has a pattern of black and white pixels that makes the letters in the watermark less "dense" than regular text. At the same time, the watermark letters are very big, much bigger than the regular text.
An example of a somewhat similar image is this (except this one is color and the watermark characters in my case are a lot thicker and bigger; my watermarks are also a lot shorter: only 3 to 4 letters long)
It seems that there might be some sort of cleanup filter similar to the one for removing large black borders from an image, except that borders are usually "denser" than a watermark, so they appear "more black".
I have 3 tools at my disposal: GIMP, ImageMagick and IrfanView. Can you recommend any specific features of any subset of these tools that might help me?
Playing with contrast etc did not help, but I found a different way. As stated above, the regular text is a lot "denser" than the watermark text meaning that a regular black pixel has more surrounding black pixels than a watermark black pixel. So I devised a simple window-based filtering and thresholding algorithm.
Here's how I did it in Matlab, using a 5X5 window:
im=imread('imageWithWmark.tif');
imInv = ~im;
nr=size(imInv,1);
nc=size(imInv,2);
d = 2; % for 5X5 window
counts = zeros(nr,nc);
for rr = d+1 : nr-d-1
    for cc = d+1 : nc-d-1
        counts(rr,cc) = nnz(imInv(rr-d:rr+d,cc-d:cc+d));
    end
end
thresh=10; % 10 out of 25 -- the larger the thresh the thinner the resulting letters are
imThresh = (counts>=thresh) & imInv;
imwrite(~imThresh,sprintf('Thresh_%d.tif',thresh),'Compression','none','Resolution',300);
Of course, the size of the window, the threshold and other parameters depend on the properties of the regular text on the page (letters bigger/smaller, thicker/thinner, etc.), but even this initial version worked pretty well.
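The same window-count filter can be sketched in Python with NumPy, replacing the double loop with shifted array sums. This is an illustrative port and not the code I ran: the function names window_counts and remove_sparse_pixels and the synthetic page are assumptions, and unlike the MATLAB loop it zero-pads the image, so border pixels are counted too.

```python
import numpy as np

def window_counts(black, d=2):
    """Count True pixels in the (2d+1) x (2d+1) window around every pixel."""
    padded = np.pad(black.astype(np.int32), d, mode="constant")
    rows, cols = black.shape
    counts = np.zeros((rows, cols), dtype=np.int32)
    # Sum the (2d+1)^2 shifted copies of the image instead of looping over pixels.
    for dr in range(2 * d + 1):
        for dc in range(2 * d + 1):
            counts += padded[dr:dr + rows, dc:dc + cols]
    return counts

def remove_sparse_pixels(black, d=2, thresh=10):
    """Keep a black pixel only if its window holds at least thresh black pixels."""
    return (window_counts(black, d) >= thresh) & black

# Hypothetical page: True means black. A solid 6x6 "text" block and a
# sparse dot pattern standing in for the watermark.
page = np.zeros((20, 30), dtype=bool)
page[2:8, 2:8] = True
page[12:19:3, 12:28:3] = True

cleaned = remove_sparse_pixels(page)
```

With thresh=10 the four corner pixels of the solid block are also dropped, since each sees only 9 black pixels in its window; this is the thinning effect mentioned in the comment on thresh.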