EmguCV Cut Face+Neck Skin Only And Save New Image - opencv

In my app, the input is an image of a person, and I want to output only that person's face and neck as a separate image. Example: the image below is the input (source: http://www.fremantlepress.com.au):
And I want to get this image as output:
I want to perform the following algorithm:
1. Detect face
2. Select (face region * 2) area
3. Detect skin and neck
4. Cut the skin region of the selected image
5. Save that cut region into a new image
After going through the EmguCV wiki and other online resources, I am confident I can perform steps 1 and 2, but I am not sure how to accomplish steps 3 and 4.
There are some functions/methods I am looking at (Canny edge detection, contours etc.), but I am not sure how and where I should apply them.
I am using EmguCV (C#) in a Windows Forms application.
Please help me with steps 3 and 4. I would be glad if someone could elaborate on these two steps and provide some code as well.

Well, there are several ways you could approach this. Edge detection will only give you a binary image of edges, and you will have to perform some line tracing or Hough transforms to detect their location. Their accuracy will vary.
I will assume for now that you can detect the eyes and the relative location of the face. I would expect a statistical filter to provide a favourable outcome with better performance than a neural network, which is the best alternative. Another good alternative is naturally colour segmentation, if colour images are used (this is far easier to implement). I will also assume that the head position can change slightly, with the neck being more or less visible within an image.
So for a Statistical Filter:
(Note that the background of the individual is similar to the face data when dealing with a greyscale image so a colour image would be better to work with).
1. Take a blank copy of our original image. We will form a binary map of our face on this; while not necessary, it will allow us to examine our success more easily.
2. Find the face, eyes and mouth in the original image.
3. We can assume that any data from the eyes and mouth forms part of the face, and mark these on the blank copy with "1"s.
4. Now we need a bit of maths. As we know, the face detection algorithm can only detect a face at a certain angle to the camera. We use this to select statistical masks from certain parts of the image, let's say 2 or 3 masks of 10x10 pixels from the cheek area. This will be the most likely area of the face within the image. We use this data to get values from the image such as the mean and standard deviation.
5. We now scan across the segmented part of the image where we have detected the face. We won't do the whole image as this would take a long time. (Note: there is a border half the size of the mask that won't be looked at.) We examine each pixel and its surrounding neighbours up to the size of the 10x10 mask. If the average or standard deviation (whatever we are examining) is similar to that of our filter, say within 10%, then we mark this pixel in our blank copy as a "1" and consider that pixel to belong to the skin.
As for Colour Segmentation:
(Note: You could also try this process for greyscale however it will be less successful due to the brickwork)
Repeat steps 1 to 2.
Again, we select certain areas of the image that we can expect to contain face data (e.g. 10 pixels below the eye). In this case, however, we examine the data that forms the colour of each pixel. Don't forget that HSV images can give better results from this process, and a combination of colour spaces more so. We can then scan across the image, examining each pixel for a similar colour. If it matches, mark it on your binary map.
An alternative is subtracting or adding a calculated value from the R, G and B channels of the image so that only the face data survives. You can convert this directly to a binary image by setting any value > 1 to 1.
This will only work for skin; for the hair we will need other filters. A few notes:
A statistical filter working on a colour image has a far greater ability; however, it takes longer.
Use data from the image to form your statistical filter, as this will allow other skin colours to be classified. A mathematically designed filter or colour segmentation will require a lot of work to achieve the same variability.
The size of the mask is important: the greater the mask size, the less likely errors are to occur, but again processing time increases.
You can speed up the process by referencing the same area within the binary map copy: if the pixel you're examining is already a 1 (classified by eye/nose/mouth detection), then why examine it again? Just skip it.
Multiple skin filters will provide better results; however, they may also introduce more noise, and remember that each filter must then be compared with each pixel, increasing processing time.
Getting an algorithm working accurately will require a bit of trial and error, but you should see comparable results fairly quickly using these methods.
I hope this helps you on your way. Sorry for not including any code, but hopefully others can help you where you get stuck, and writing it yourself will help you understand what is going on and allow you to cut down on processing time. Let me know if you require any additional advice. I'm doing my PhD in image analysis, just so you know that the advice is sound.
Take Care
Chris
[EDIT]
Some quick results:
Here is a 20x20 filter applied in detecting the hair. The program I've written only works on greyscale images at the moment so the skin detection suffers interference from the stone (see later)
Colour Image of Face Region
Binary Map of Average Hair Filter 20x20 Mask 40% Error allowed
As can be observed there is interference from the shirt in this case as it matches the colour of the hair. This can be eliminated by simply only examining the top third or half of the detected facial region.
Binary Map of Average Skin Filter 20x20 Mask 40% Error allowed
In this image I use only 1 filter, formed from the chin area, as the stubble obviously changes the filter's behaviour. There is still noise from the stone behind the individual; however, using a colour image could eliminate this. The gaps in this case could be filled by an algorithm or another filter. Again there is noise from the edge of the shirt, but we could minimise this either by detecting the shirt and removing any data that forms it, or simply by only looking in certain areas.
Examples of the Regions to Inspect
To eliminate false classification, you could take the top two thirds of the segmented image and look for the face there, and look for neck data in a strip as wide as the detected eyes that runs to the bottom of the facial region.
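Once a binary map exists, cutting the marked region out of the original (steps 4 and 5 of the question) is mostly array indexing. The sketch below is an assumption about how one might do it in Python with NumPy: it returns a BGRA image in which unmarked pixels are fully transparent. `cut_region` is a hypothetical helper name.

```python
import numpy as np

def cut_region(bgr, binary_map):
    """Return a BGRA image that keeps only the pixels marked 1 in binary_map."""
    keep = binary_map.astype(bool)
    cut = np.zeros(bgr.shape[:2] + (4,), dtype=np.uint8)
    cut[keep, :3] = bgr[keep]  # copy the colour of the kept pixels
    cut[keep, 3] = 255         # make the kept pixels opaque
    return cut
```

The result could then be written with `cv2.imwrite` to a PNG file, which preserves the alpha channel.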
Cheers Again
Chris

Hello Chris, can you share the code for this? I have used the GrabCut algorithm to crop the face down to the neck, but the accuracy on my images is not perfect. I am sharing the code, in which I use a webcam to capture images, blur the background and then apply GrabCut. Please check it and reply.
import numpy as np
import cv2
import pixellib
from pixellib.tune_bg import alter_bg
rect = (0,0,0,0)
startPoint = False
endPoint = False
img_counter = 0
# function for mouse callback
def on_mouse(event,x,y,flags,params):
    global rect,startPoint,endPoint
    # get mouse click
    if event == cv2.EVENT_LBUTTONDOWN:
        if startPoint == True and endPoint == True:
            startPoint = False
            endPoint = False
            rect = (0, 0, 0, 0)
        if startPoint == False:
            rect = (x, y, 0, 0)
            startPoint = True
        elif endPoint == False:
            rect = (rect[0], rect[1], x, y)
            endPoint = True
#cap = cv2.VideoCapture("YourVideoFile.mp4")
#cap = cv2.imread("/home/mongoose/Projects/background removal/bg_grabcut/GrabCut-from-video-master/IMG_6471.jpg")
#capturing the camera feed, '0' denotes the first camera connected to the computer
cap = cv2.VideoCapture(0)
waitTime = 50
change_bg = alter_bg(model_type = "pb")
change_bg.load_pascalvoc_model("/home/mongoose/Projects/background removal/bg_grabcut/test/xception_pascalvoc.pb")
change_bg.blur_camera(cap, extreme = True, frames_per_second= 10, output_video_name= "output_video.mp4", show_frames= True, frame_name= "frame", detect = "person")
#Reading the first frame
(grabbed, frame) = cap.read()
while(cap.isOpened()):
    (grabbed, frame) = cap.read()
    cv2.namedWindow('frame')
    cv2.setMouseCallback('frame', on_mouse)
    #drawing rectangle
    if startPoint == True and endPoint == True:
        cv2.rectangle(frame, (rect[0], rect[1]), (rect[2], rect[3]), (255, 0, 255), 2)
    if not grabbed:
        break
    cv2.imshow('frame',frame)
    key = cv2.waitKey(waitTime)
    if key == ord('q'):
        #esc pressed
        break
    elif key % 256 == 32:
        # SPACE pressed
        alpha = 1 # Transparency factor.
        img_name = "opencv_frame_{}.png".format(img_counter)
        imgCopy = frame.copy()
        img = frame
        mask = np.zeros(img.shape[:2], np.uint8)
        bgdModel = np.zeros((1, 65), np.float64)
        fgdModel = np.zeros((1, 65), np.float64)
        w = abs(rect[0]-rect[2]+10)
        h= abs(rect[1]-rect[3]+10)
        rect2 = (rect[0]+10, rect[1]+10,w ,h )
        cv2.grabCut(img, mask, rect2, bgdModel, fgdModel, 100, cv2.GC_INIT_WITH_RECT)
        mask2 = np.where((mask == 2) | (mask == 0), 0, 1).astype('uint8')
        img = img * mask2[:, :, np.newaxis]
        cv2.imwrite(img_name, img )
        print("{} written!".format(img_name))
        img_counter += 1
cap.release()
cv2.destroyAllWindows()

Related

Robust estimation of volume of transparent liquid using image processing

I'm working on a project which involves determining the volume of a transparent liquid (or air if it proves easier) in a confined space.
The images I'm working with are a background image of the container without any liquid and a foreground image, which may also be empty in rare cases but most of the time is partly filled with some amount of liquid.
While it may seem like a pretty straightforward smooth and threshold approach, it proves somewhat more difficult.
I'm working with a set with tons of these image pairs of background and foreground images, and I can't seem to find an approach that is robust enough to be applied to all images in the set.
My work so far involves smoothing and thresholding the image and applying closing to wrap it up.
import cv2 as cv
import numpy as np
bg_image = cv.imread("bg_image", 0)
fg_image = cv.imread("fg_image", 0)
blur_fg = cv.GaussianBlur(fg_image, (5, 5), sigmaX=0, sigmaY=0)
thresholded_image = cv.threshold(blur_fg, 186, 255, cv.THRESH_BINARY_INV)[1]
kernel = np.ones((4,2),np.uint8)
closing = cv.morphologyEx(thresholded_image, cv.MORPH_CLOSE, kernel)
The results vary, here is an example when it goes well:
In other examples, it doesn't go as well:
Aside from that, I have also tried:
Subtraction of the background and foreground images
Contrast stretching
Histogram equalization
Other thresholding techniques such as Otsu
The main issue is that the pixel intensities of air and liquid sometimes overlap (and the contrast is pretty low in general), causing inaccurate estimations. I am leaning towards utilizing the edge that occurs between the liquid and the air, but I'm not really sure how.
I don't want to overflow with information here so I'm leaving it at that. I am grateful for any suggestions and can provide more information if necessary.
EDIT:
Here are some sample images to play around with.
Here is an approach whereby you calculate the mean of each column of pixels in your image, then calculate the gradient of the means:
#!/usr/bin/env python3
import cv2
import numpy as np
import matplotlib.pyplot as plt
filename = 'fg1.png'
# Load image as greyscale and calculate means of each column of pixels
im = cv2.imread(filename, cv2.IMREAD_GRAYSCALE)
means = np.mean(im, axis=0)
# Calculate the gradient of the means
y = np.gradient(means)
# Plot the gradient of the means
xdata = np.arange(0, y.shape[0])
plt.plot(xdata, y, 'bo') # blue circles
plt.title(f'Gradient of Column Means for "{filename}"')
plt.xlabel('x')
plt.ylabel('Gradient of Column Means')
plt.grid(True)
plt.show()
If you just plot the means of all columns, without taking the gradient, you get this:
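To turn the gradient into a position, one option is to take the column where the absolute gradient of the column means is largest. The sketch below assumes, as the column-mean approach above does, that the liquid/air boundary runs roughly vertically in the image and that it is the strongest intensity transition; `strongest_transition` is an illustrative name.

```python
import numpy as np

def strongest_transition(gray):
    """Index of the column where the mean intensity changes most sharply."""
    means = gray.astype(np.float64).mean(axis=0)  # one mean per column
    grad = np.gradient(means)                     # change between neighbouring columns
    return int(np.argmax(np.abs(grad)))
```

If the container walls produce stronger edges than the liquid surface, the search would have to be restricted to the columns between the walls.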

computer vision - Counting small circles in an image

The image below has many circles. Click and zoom in to see the circles.
https://drive.google.com/open?id=1ox3kiRX5hf2tHDptWfgcbMTAHKCDizSI
What I want is counting the circles using any free language, such as python.
Is there a function or idea to do it?
Edit: I came up with a better solution, partially inspired by this answer below. I thought of this method originally (as noted in the OP comments) but I decided against it. The original image was just not good enough quality for it. However I improved that method and it works brilliantly for the better quality image. The original approach is first, and then the new approach at the bottom.
First approach
So here's a general approach that seems to work well, but definitely just gives estimates. This assumes that circles are roughly the same size.
First, the image is mostly blue---so it seems reasonable to just do the analysis on the blue channel. Thresholding the blue channel, in this case, using Otsu thresholding (which determines an optimal threshold value without input) seems to work very well. This isn't too much of a surprise since the distribution of color values is pretty much binary. Check the mask that results from it!
Then, do a connected component analysis on the mask to get the area of each component (component = white blob in the mask). The statistics returned from connectedComponentsWithStats() give (among other things) the area, which is exactly what we need. Then we can simply count the circles by estimating how many circles fit in a given component based on its area. Also note that I'm taking the statistics for every label except the first one: this is the background label 0, and not any of the white blobs.
Now, how large in area is a single circle? It would be best to let the data tell us. So you could compute a histogram of all the areas, and since there are more single circles than anything else, there will be a high concentration around 250-270 pixels or so for the area. Or you could just take an average of all the areas between something like 50 and 350 which should also get you in a similar ballpark.
Really in this histogram you can see the demarcations between single circles, double circles, triple, and so on quite easily. Only the larger components will give pretty rough estimates. And in fact, the area doesn't seem to scale exactly linearly. Blobs of two circles are slightly larger than two single circles, and blobs of three are larger still than three single circles, and so on, so this makes it a little difficult to estimate nicely, but rounding should still keep us close. If you want you could include a small multiplication parameter that increases as the area increases to account for that, but that would be hard to quantify without going through the histogram analytically...so, I didn't worry about this.
A single circle's area divided by the average single-circle area should be close to 1, and the area of a 5-circle group divided by the average circle area should be close to 5. This also means that small insignificant components, which are 1 or 10 or even 100 pixels in area, will not count towards the total: with an average circle area of roughly 250-270 pixels, 100/avg_circle_size < 1/2, so those round down to a count of 0. Thus I should be able to take all the component areas, divide them by the average circle size, round, and get a decent estimate by summing them all up.
import cv2
import numpy as np
img = cv2.imread('circles.png')
mask = cv2.threshold(img[:, :, 0], 255, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
stats = cv2.connectedComponentsWithStats(mask, 8)[2]
label_area = stats[1:, cv2.CC_STAT_AREA]
min_area, max_area = 50, 350 # min/max for a single circle
singular_mask = (min_area < label_area) & (label_area <= max_area)
circle_area = np.mean(label_area[singular_mask])
n_circles = int(np.sum(np.round(label_area / circle_area)))
print('Total circles:', n_circles)
This code is simple and effective for rough counts.
However, there are definitely some assumptions here about the groups of circles compared to a normal circle size, and there are issues where circles that are at the boundaries will not be counted correctly (these aren't well defined---a two circle blob that is half cut off will look more like one circle---no clear way to count or not count these with this method). Further I just used automatic thresholding via Otsu here; you could get (probably better) results with more careful color filtering. Additionally in the mask generated by Otsu, some circles that are masked have a few pixels removed from their center. Morphology could add these pixels back in, which would give you a (slightly larger) more accurate area for the single circle components. Either way, I just wanted to give the general idea towards how you could easily estimate this with minimal code.
New approach
Before, the goal was to count circles. This new approach instead counts the centers of the circles. The general idea is you threshold and then flood fill from a background pixel to fill in the background (flood fill works like the paint bucket tool in photo editing apps), that way you only see the centers, as shown in this answer below.
However, this relies on global thresholding, which isn't robust to local lighting changes. This means that since some centers are brighter/darker than others, you won't always get good results with a single threshold.
Here I've created an animation to show looping through different threshold values; watch as some centers appear and disappear at different times, meaning you get different counts depending on the threshold you choose (this is just a small patch of the image, it happens everywhere):
Notice that the first blob to appear in the top left actually disappears as the threshold increases. However, if we actually OR each frame together, then each detected pixel persists:
But now every single speck appears, so we should clean up the mask each frame so that we remove single pixels as they come (otherwise they may build up and be hard to remove later). Simple morphological opening with a small kernel will remove them:
Applied over the whole image, this method works incredibly well and finds almost every single cell. There are only three false positives (detected blob that's not a center) and two misses I can spot, and the code is very simple. The final thing to do after the mask has been created is simply count the components, minus one for the background. The only user input required here is a single point to flood fill from that is in the background (seed_pt in the code).
img = cv2.imread('circles.png', 0)
seed_pt = (25, 25)
fill_color = 0
mask = np.zeros_like(img)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
for th in range(60, 120):
    prev_mask = mask.copy()
    mask = cv2.threshold(img, th, 255, cv2.THRESH_BINARY)[1]
    mask = cv2.floodFill(mask, None, seed_pt, fill_color)[1]
    mask = cv2.bitwise_or(mask, prev_mask)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
n_centers = cv2.connectedComponents(mask)[0] - 1
print('There are %d cells in the image.'%n_centers)
There are 874 cells in the image.
One possible solution would be to read the image using OpenCV, convert it to grayscale, then use Canny edge detection and perform contour finding in OpenCV. This will return a list of contours. It would look something like:
import cv2
image = cv2.imread('path-to-your-image')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# tweak the parameters of the GaussianBlur for best performance
blurred = cv2.GaussianBlur(gray, (7, 7), 0)
# again, try different values here
edged = cv2.Canny(blurred, 20, 140)
# findContours returns 3 values in OpenCV 3 and 2 values in OpenCV 4; [-2] is the contour list in both
contours = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
print(len(contours))
If all your images look like this, consider thresholding them, not necessarily with an automatic threshold-seeking algorithm like Otsu, but with a simple fixed threshold value. Before thresholding you have to convert your colour input to grayscale, or take one of the colour channels. Then, based on a few experiments with channels and threshold values, determine a threshold value that leaves circles with holes in the binary result. Based on your PNG image, I found that a value of 81 (gray intensity ranges from 0 to 255) works well to threshold the grayscale version of your input into such a binary image with the holes in place.
Then simply count those holes.
Holes can be determined by seed-filling the white area connected to the image border. As a result you will have white hole connected components on a black background, so simply count them.
You can find more details at http://www.leptonica.com/filling.html and use Leptonica primitives to do the thresholding, hole counting and so on.

OMR: evaluate filled circle

I'm implementing an OMR system for test papers, but I ran into problems when determining which circles are filled. I've succeeded in extracting these grayscale regions of interest.
The problems are:
- Binary thresholding (adaptive and fixed) followed by counting non-zero pixels gives a lot of errors because of the letters in the circles and the varying brightness of photos taken with mobile cameras.
- I also tried the technique described in this survey, which uses the average grayscale value of a circle to mark it as filled or not, but image brightness is not uniform because of different light sources when people take photos with their cameras, and I got a lot of wrong results.
- People also don't follow the rules, such as filling the whole circle; the algorithm needs to be robust in such cases too.
Sample images
I already have about 10 GB of samples, so maybe machine learning or other statistical methods will be useful.
Does anybody know other methods to classify a circle as filled?
Since this is not a straightforward problem, it needs a lot of tweaking to make it robust. But I would like to suggest a good starting point. You can play with it and try to make it work.
import numpy as np
import cv2
image_ori = cv2.imread("circle_opt.png")
lower_bound = np.array([0, 0, 0])
upper_bound = np.array([255, 255, 195])
image = image_ori
mask = cv2.inRange(image_ori, lower_bound, upper_bound)
masked_red = cv2.bitwise_and(image, image, mask=mask)
kernel = np.ones((3,3),np.uint8)
closing = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
# [-2] selects the contour list in both OpenCV 3 and OpenCV 4
contours = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL,
                            cv2.CHAIN_APPROX_SIMPLE)[-2]
# sort left to right by the x coordinate of the bounding box
contours = sorted(contours, key=lambda x: cv2.boundingRect(x)[0])
print(len(contours))
for c in contours:
    (x,y),r = cv2.minEnclosingCircle(c)
    center = (int(x),int(y))
    r = int(r)
    if 10 <= r <= 15:
        cv2.circle(image,center,r,(0,255,0),2)
# cv2.imwrite('omr_processed.png', image_ori)
cv2.imshow("original",image_ori)
cv2.waitKey(0)
The result I got from my code on the image you shared was this
You can apply thresholds to these green circled patches and then count non-zeros to get if the circle is marked or not. You can play with lower and upper_bound to try to make the solution robust.
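The thresholding-and-counting step could be sketched as below: for each detected circle, measure the fraction of pixels inside it that are darker than a threshold and compare that fraction with a cut-off. The function name and the threshold of 128 are assumptions; with uneven lighting, a per-image or per-circle threshold would probably be needed.

```python
import numpy as np

def fill_ratio(gray, center, radius, dark_thresh=128):
    """Fraction of pixels inside the circle that are darker than dark_thresh."""
    h, w = gray.shape
    yy, xx = np.ogrid[:h, :w]
    # Boolean mask of the pixels that lie inside the circle.
    inside = (xx - center[0]) ** 2 + (yy - center[1]) ** 2 <= radius ** 2
    return float((gray[inside] < dark_thresh).mean())
```

A circle could then be treated as marked when, say, more than half of its pixels are dark, which tolerates partially filled circles better than requiring a full fill.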
Hope this helps! Good luck on your problem solving :)

Having difficulties detecting small objects in noisy background. Any ways to fix this?

I am trying to make a computer vision program in which it would detect litter and random trash in a noisy background such as the beach (noisy due to sand).
Original Image:
Canny Edge detection without any image processing:
I realize that a certain combination of image processing techniques will help me accomplish my goal of ignoring the noisy sandy background and detecting all trash and objects on the ground.
I tried to perform median blurring and tuned the parameters, and it gave me this:
It performs well in terms of ignoring the sandy background, but it fails to detect many of the other objects on the ground, possibly because they are blurred out (not too sure).
Is there any way of improving my algorithm or image processing techniques so that they ignore the noisy sandy background while still allowing Canny edge detection to find all objects, so that the program can detect and draw contours around all of them?
Code:
from pyimagesearch.transform import four_point_transform
from matplotlib import pyplot as plt
import numpy as np
import cv2
import imutils
im = cv2.imread('images/beach_trash_3.jpg')
#cv2.imshow('Original', im)
# Histogram equalization to improve contrast
###
#im = np.fliplr(im)
im = imutils.resize(im, height = 500)
imgray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
# Contour detection
#ret,thresh = cv2.threshold(imgray,127,255,0)
#imgray = cv2.GaussianBlur(imgray, (5, 5), 200)
imgray = cv2.medianBlur(imgray, 11)
cv2.imshow('Blurred', imgray)
'''
hist,bins = np.histogram(imgray.flatten(),256,[0,256])
plt_one = plt.figure(1)
cdf = hist.cumsum()
cdf_normalized = cdf * hist.max()/ cdf.max()
cdf_m = np.ma.masked_equal(cdf,0)
cdf_m = (cdf_m - cdf_m.min())*255/(cdf_m.max()-cdf_m.min())
cdf = np.ma.filled(cdf_m,0).astype('uint8')
imgray = cdf[imgray]
cv2.imshow('Histogram Normalization', imgray)
'''
'''
imgray = cv2.adaptiveThreshold(imgray,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\
cv2.THRESH_BINARY,11,2)
'''
thresh = imgray
#imgray = cv2.medianBlur(imgray,5)
#imgray = cv2.Canny(imgray,10,500)
thresh = cv2.Canny(imgray,75,200)
#thresh = imgray
cv2.imshow('Canny', thresh)
contours, hierarchy = cv2.findContours(thresh.copy(),cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
cnts = sorted(contours, key = cv2.contourArea, reverse = True)[:5]
test = im.copy()
cv2.drawContours(test, cnts, -1,(0,255,0),2)
cv2.imshow('All contours', test)
print('---------------------------------------------')
##### Code to show each contour #####
main = np.array([[]])
for c in cnts:
    epsilon = 0.02*cv2.arcLength(c,True)
    approx = cv2.approxPolyDP(c,epsilon,True)
    test = im.copy()
    cv2.drawContours(test, [approx], -1,(0,255,0),2)
    #print 'Contours: ', contours
    if len(approx) == 4:
        print('Found rectangle')
        print('Approx.shape: ', approx.shape)
        print('Test.shape: ', test.shape)
        # frame_f = frame_f[y: y+h, x: x+w]
        frame_f = test[approx[0,0,1]:approx[2,0,1], approx[0,0,0]:approx[2,0,0]]
        print('frame_f.shape: ', frame_f.shape)
        main = np.append(main, approx[None,:][None,:])
        print('main: ', main)
    # Uncomment in order to show all rectangles in image
    #cv2.imshow('Show Ya', test)
    #print 'Approx: ', approx.shape
    #cv2.imshow('Show Ya', frame_f)
    cv2.waitKey()
print('---------------------------------------------')
cv2.drawContours(im, cnts, -1,(0,255,0),2)
print(main.shape)
print(main)
cv2.imshow('contour-test', im)
cv2.waitKey()
What I understand from your problem is: you want to segment out the foreground objects from a background which is variable in nature (the sand's gray level depends on many other conditions).
There are various ways to approach this kind of problem:
Approach 1:
One thing is clear from your image: background pixels will always far outnumber foreground pixels. The simplest way to start an initial segmentation is:
1. Convert the image to gray.
2. Create its histogram.
3. Find the peak index of the histogram, i.e. the index which has the maximum number of pixels.
The three steps above give you an idea of the background, but the game does not end here. Now you can put this index value in the center and take a range of values around it, like 25 above and below. For example, if your peak index is 207 (as in your case), choose a range of gray levels from 75 to 225 and threshold the image. Given the nature of your background, this method can be used for foreground object detection. After segmentation you have to perform some post-processing steps, like morphological analysis, to separate the different objects. After extracting the objects you can apply some classification for a finer level of segmentation to remove false positives.
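A small NumPy sketch of Approach 1 follows. It assumes an 8-bit grayscale image and treats everything within a fixed band around the histogram peak as background; the band half-width of 25 follows the "25 above and below" suggestion, and `foreground_mask` is an illustrative name.

```python
import numpy as np

def foreground_mask(gray, half_width=25):
    """Return the histogram peak and a mask of pixels outside the band around it."""
    hist = np.bincount(gray.ravel(), minlength=256)  # 256-bin gray histogram
    peak = int(np.argmax(hist))                      # most frequent gray level
    lo = max(peak - half_width, 0)
    hi = min(peak + half_width, 255)
    background = (gray >= lo) & (gray <= hi)
    return peak, ~background
```

The returned mask would still need the morphological clean-up mentioned above before individual objects are extracted.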
Approach 2:
Play with some statistics of the image pixels:
1. Make a small data set of gray values and label them class 1 and 2, for example 1 for sand and 2 for foreground.
2. Find the mean and standard deviation of the pixels from both classes, and also calculate the probability of each class (num_pix_per_class / total_num_pix). Store these stats for later use.
3. Now come back to the image, take every pixel one by one and apply a Gaussian pdf: $p(\text{pix}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(\text{pix} - \mu)^2}{2\sigma^2}\right)$, where $\mu$ is the mean calculated earlier and $\sigma$ is the standard deviation calculated earlier.
4. After applying step 3 you will get two probability values for each pixel, one for each class; just choose the class which has the higher probability.
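Approach 2 amounts to a two-class Gaussian classifier on gray values. The sketch below is one possible reading of it, assuming the class likelihood is weighted by the class probability computed from the labelled samples; all function names are illustrative, and the small floor on sigma is an added safeguard against zero variance.

```python
import numpy as np

def fit_class(samples, total):
    """Mean, standard deviation and prior probability of one labelled class."""
    samples = np.asarray(samples, dtype=np.float64)
    sigma = max(float(samples.std()), 1e-6)  # avoid division by zero
    return float(samples.mean()), sigma, samples.size / total

def gaussian_pdf(x, mean, sigma):
    return np.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2)) / (np.sqrt(2.0 * np.pi) * sigma)

def classify(gray, sand_samples, object_samples):
    """Label each pixel 1 (sand) or 2 (foreground) by the larger weighted pdf."""
    total = len(sand_samples) + len(object_samples)
    stats = [fit_class(sand_samples, total), fit_class(object_samples, total)]
    x = gray.astype(np.float64)
    scores = [prior * gaussian_pdf(x, mean, sigma) for mean, sigma, prior in stats]
    return np.where(scores[1] > scores[0], 2, 1)
```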
Approach 3:
Approach 3 is more complex than the two above: you can use some texture-based operation to segment out the sand-type texture, but for texture-based methods I would recommend supervised classification rather than unsupervised (like k-means).
Different texture feature which you can use are:
Basic:
Range of gray levels in a defined neighborhood.
local mean and variance or entropy.
Gray Level Co-occurrence Matrices (GLCM).
Advanced:
Local Binary Patterns.
Wavelet Transform.
Gabor Transform. etc.
PS: In my opinion you should give approaches 1 and 2 a try. They can save a lot of work. :)
For better results you should apply many algorithms. The OpenCV tutorials always focus on one feature of OpenCV. Real CV applications should use as many techniques and algorithms as possible.
I used to detect biological cells in noisy pictures, and I got very good results by applying some contextual information:
Expected size of cells
The fact that all cells have similar size
Expected number of cells
So I changed many parameters and tried to detect what I'm looking for.
If you use edge detection, the sand will give rather random shapes. Try changing the Canny parameters and detecting lines, rectangles, circles, etc., i.e. any shapes that are more probable for litter. Remember the positions of the detected objects for each parameter set, and at the end give priority to those positions (areas) where shapes were detected most often.
Use color-separation. The peaks in color-histogram could be the hints to the litter, as the distribution of sand-colors should be more even.
For some often appearing, small objects like cigarette-stubs you can apply object matching.
P.S:
Cool application! Just out of curiosity, are you going to scan the beach with a quadcopter?
If you want to detect objects on such uniform background, you should start by detecting the main color in the image. Like that you will detect all the sand, and the objects will be in the remaining parts. You can take a look to papers published by Arnaud LeTrotter and Ludovic Llucia who both used this type of "main color detection".

Finding location of rectangles in an image with OpenCV

I'm trying to use OpenCV to "parse" screenshots from the iPhone game Blocked. The screenshots are cropped to look like this:
I suppose for right now I'm just trying to find the coordinates of each of the 4 points that make up each rectangle. I did see the sample file squares.c that comes with OpenCV, but when I run that algorithm on this picture, it comes up with 72 rectangles, including the rectangular areas of whitespace that I obviously don't want to count as one of my rectangles. What is a better way to approach this? I tried doing some Google research, but for all of the search results, there is very little relevant usable information.
A similar issue has already been discussed:
How to recognize rectangles in this image?
As for your data, rectangles you are trying to find are the only black objects. So you can try to do a threshold binarization: black pixels are those ones which have ALL three RGB values less than 40 (I've found it empirically). This simple operation makes your picture look like this:
After that you could apply a Hough transform to find lines (discussed in the topic I referred to), or you can do it more simply. Compute integral projections of the black pixels onto the X and Y axes. (The projection onto X is a vector whose entry for x_i is the number of black pixels whose first coordinate equals x_i.) The peaks of the projections give you the candidate x and y values. Then look through all the possible segments bounded by the found x and y values (if there are a lot of black pixels between (x_i, y_j) and (x_i, y_k), there is probably a line there). Finally, compose the line segments into rectangles!
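The binarization and the integral projections can be sketched in a few lines of NumPy. The limit of 40 is the empirical value quoted above; the function name is illustrative, and peak detection on the two vectors is left out.

```python
import numpy as np

def black_projections(rgb, limit=40):
    """Count black pixels per column (X projection) and per row (Y projection)."""
    # A pixel is black when all three colour values are below the limit.
    black = np.all(rgb < limit, axis=2)
    proj_x = black.sum(axis=0)  # one count per x coordinate
    proj_y = black.sum(axis=1)  # one count per y coordinate
    return proj_x, proj_y
```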
Here's a complete Python solution. The main idea is:
Apply pyramid mean shift filtering to help threshold accuracy
Otsu's threshold to get a binary image
Find contours and filter using contour approximation
Here's a visualization of each detected rectangle contour
Results
import cv2
image = cv2.imread('1.png')
blur = cv2.pyrMeanShiftFiltering(image, 11, 21)
gray = cv2.cvtColor(blur, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.015 * peri, True)
    if len(approx) == 4:
        x,y,w,h = cv2.boundingRect(approx)
        cv2.rectangle(image,(x,y),(x+w,y+h),(36,255,12),2)
cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.waitKey()
I wound up just building on my original method and doing as Robert suggested in his comment on my question. After I get my list of rectangles, I then run through and calculate the average color over each rectangle. I check to see if the red, green, and blue components of the average color are each within 10% of the gray and blue rectangle colors, and if they are I save the rectangle, if they aren't I discard it. This process gives me something like this:
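The colour check described above might look like this in Python. It is a sketch under assumptions: "within 10%" is interpreted relative to each reference channel value, rectangles are given as (x, y, w, h), and the reference colours are supplied by the caller since the actual gray and blue values are not given here. `matches_reference` is an illustrative name.

```python
import numpy as np

def matches_reference(bgr, rect, references, tolerance=0.10):
    """True if the rectangle's average colour is within tolerance of any reference."""
    x, y, w, h = rect
    mean = bgr[y:y + h, x:x + w].reshape(-1, 3).mean(axis=0)
    for ref in references:
        ref = np.asarray(ref, dtype=np.float64)
        # Every channel must lie within tolerance * reference value.
        if np.all(np.abs(mean - ref) <= tolerance * ref):
            return True
    return False
```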
From this, it's trivial to get the information I need (orientation, starting point, and length of each rectangle, considering the game window as a 6x6 grid).
The blocks look like bitmaps - why don't you use simple template matching with different templates for each block size/color/orientation?
Since your problem is the small rectangles I would start by removing them.
Since those lines are much thinner than the borders of the rectangles I would start by applying morphological operations on the image.
Using a structural element that looks like this:
element = [ 1 1
1 1 ]
should remove lines that are less than two pixels wide. After the small lines are removed the rectangle finding algorithm of OpenCV will most likely do the rest of the job for you.
The erosion can be done in OpenCV by the function cvErode
Try one of the many corner detectors, like the Harris corner detector. It is also generally a good idea to try this at multiple resolutions, so do some preprocessing at varying magnifications.
It appears that you want squares dominated by a particular colour. In that case you can suppress the other colours by first using something like cvSplit and then thresholding the colour channel, so that only that region remains. Follow that with a cropping operation. I think that could work as well.

Resources