I'm trying to extract the geometries of the papers in the image below, but I'm having trouble grabbing the contours. I don't know which threshold algorithm to use (here I used a static threshold of 10, which is probably not ideal).
As you can see, I get the correct number of images, but I can't get the proper bounds using this method.
Simply applying Otsu doesn't work; it doesn't capture the geometries.
I assume I need to apply some edge detection, but I'm not sure what to do once I've applied Canny or some other detector.
I also tried Sobel in both directions (+ve and -ve in x and y), but I'm unsure how to extract the contours from there.
How do I grab these contours?
Below are some previews of the images at each stage leading up to the final convex hull results.
**Original Image** **Sharpened**
**Dilate,Sharpen,Erode,Sharpen** **Convex Of Approximated Polygons Hulls (which doesn't fully capture desired regions)**
Sorry in advance about the horrible formatting, I have no idea how to make images smaller or title them nicely in SOF
I am trying to make a computer vision program that detects litter and random trash against a noisy background such as a beach (noisy due to the sand).
Original Image:
Canny Edge detection without any image processing:
I realize that some combination of image processing techniques should help me ignore the noisy sandy background and detect all trash and objects on the ground.
I tried to perform median blurring, played around with the parameters, and got this:
It performs well in terms of ignoring the sandy background, but it fails to detect many of the other objects on the ground, possibly because they are blurred out (I'm not too sure).
Is there any way of improving my algorithm or image processing techniques so that the noisy sandy background is ignored while Canny edge detection still finds all objects, and the program detects and draws contours on all of them?
Code:
from pyimagesearch.transform import four_point_transform
from matplotlib import pyplot as plt
import numpy as np
import cv2
import imutils
im = cv2.imread('images/beach_trash_3.jpg')
#cv2.imshow('Original', im)
# Histogram equalization to improve contrast
###
#im = np.fliplr(im)
im = imutils.resize(im, height = 500)
imgray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
# Contour detection
#ret,thresh = cv2.threshold(imgray,127,255,0)
#imgray = cv2.GaussianBlur(imgray, (5, 5), 200)
imgray = cv2.medianBlur(imgray, 11)
cv2.imshow('Blurred', imgray)
'''
hist,bins = np.histogram(imgray.flatten(),256,[0,256])
plt_one = plt.figure(1)
cdf = hist.cumsum()
cdf_normalized = cdf * hist.max()/ cdf.max()
cdf_m = np.ma.masked_equal(cdf,0)
cdf_m = (cdf_m - cdf_m.min())*255/(cdf_m.max()-cdf_m.min())
cdf = np.ma.filled(cdf_m,0).astype('uint8')
imgray = cdf[imgray]
cv2.imshow('Histogram Normalization', imgray)
'''
'''
imgray = cv2.adaptiveThreshold(imgray,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\
cv2.THRESH_BINARY,11,2)
'''
thresh = imgray
#imgray = cv2.medianBlur(imgray,5)
#imgray = cv2.Canny(imgray,10,500)
thresh = cv2.Canny(imgray,75,200)
#thresh = imgray
cv2.imshow('Canny', thresh)
contours, hierarchy = cv2.findContours(thresh.copy(),cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
cnts = sorted(contours, key = cv2.contourArea, reverse = True)[:5]
test = im.copy()
cv2.drawContours(test, cnts, -1,(0,255,0),2)
cv2.imshow('All contours', test)
print('---------------------------------------------')
##### Code to show each contour #####
main = np.array([[]])
for c in cnts:
    # Approximate the contour with a polygon (tolerance: 2% of its perimeter)
    epsilon = 0.02*cv2.arcLength(c,True)
    approx = cv2.approxPolyDP(c,epsilon,True)
    test = im.copy()
    cv2.drawContours(test, [approx], -1,(0,255,0),2)
    #print('Contours: ', contours)
    if len(approx) == 4:
        print('Found rectangle')
        print('Approx.shape: ', approx.shape)
        print('Test.shape: ', test.shape)
        # frame_f = frame_f[y: y+h, x: x+w]
        frame_f = test[approx[0,0,1]:approx[2,0,1], approx[0,0,0]:approx[2,0,0]]
        print('frame_f.shape: ', frame_f.shape)
        main = np.append(main, approx[None,:][None,:])
        print('main: ', main)
    # Uncomment in order to show all rectangles in image
    #cv2.imshow('Show Ya', test)
    #print('Approx: ', approx.shape)
    #cv2.imshow('Show Ya', frame_f)
    cv2.waitKey()
    print('---------------------------------------------')
cv2.drawContours(im, cnts, -1,(0,255,0),2)
print(main.shape)
print(main)
cv2.imshow('contour-test', im)
cv2.waitKey()
What I understand from your problem is that you want to segment the foreground objects from a background that is variable in nature (the gray level of the sand depends on many other conditions).
There are various ways to approach this kind of problem:
Approach 1:
One thing is clear from your image: background-colored pixels will always far outnumber foreground pixels. The simplest way to start an initial segmentation is:
Convert the image to grayscale.
Create its histogram.
Find the peak index of the histogram, i.e. the gray level with the most pixels.
These three steps give you an estimate of the background, but the game does not end here. Put this peak value at the center and take a range of values around it, say 25 above and below, or wider if the background varies more. For example, with a peak index of 207 (as in your case), you might choose gray levels from 75 to 225 and threshold the image. Given the nature of your background, this method can be used for foreground object detection. After segmentation, you have to perform some post-processing steps, such as morphological analysis, to separate the different objects. After extracting the objects, you can apply a classifier for a finer level of segmentation and to remove false positives.
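As a rough sketch of Approach 1, here is the histogram-peak idea in plain Python on a toy image given as nested lists. The function names, the `margin` of 25, and the toy pixel values are assumptions made for illustration; on a real image you would compute the histogram with NumPy or OpenCV instead.

```python
from collections import Counter

def background_band(gray, margin=25):
    """Return the (low, high) gray-level band centred on the histogram peak."""
    counts = Counter(v for row in gray for v in row)   # gray-level histogram
    peak = max(counts, key=counts.get)                 # most frequent gray level
    return max(0, peak - margin), min(255, peak + margin)

def foreground_mask(gray, margin=25):
    """Mark pixels outside the background band as foreground (1)."""
    low, high = background_band(gray, margin)
    return [[0 if low <= v <= high else 1 for v in row] for row in gray]

# Toy 3x4 "image": mostly sand around gray level 207, plus two objects.
gray = [
    [207, 205, 210, 207],
    [207,  40, 208, 207],
    [206, 207, 250, 207],
]
band = background_band(gray)
mask = foreground_mask(gray)
print(band)
for row in mask:
    print(row)
```

The mask would still need the morphological clean-up mentioned above before individual objects can be separated.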
Approach 2:
Play with some statistics of the image pixels. For example, build a small data set of gray values and
label them class 1 and class 2, e.g. 1 for sand and 2 for foreground.
Find the mean and standard deviation (the square root of the variance) of the pixels in each class, and also calculate the probability of each class (num_pix_per_class / total_num_pix). Store these statistics for later use.
Now come back to the image, take each pixel in turn, and evaluate a Gaussian pdf for each class: $p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$, where $x$ is the pixel value, $\mu$ is the class mean calculated earlier, and $\sigma$ is the class standard deviation calculated earlier.
After step 3 you will have two probability values for each pixel, one per class; simply choose the class with the higher probability.
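A minimal sketch of Approach 2 in plain Python follows. The training gray values, the class names, and the choice to weight each pdf by the class probability are assumptions for illustration, not values taken from your image.

```python
import math

def fit_class(samples):
    """Return (mean, standard deviation) of a list of gray values."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, math.sqrt(var)

def gaussian_pdf(x, mean, sigma):
    """Gaussian probability density at x."""
    return math.exp(-((x - mean) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def classify(pixel, stats, priors):
    """Pick the class whose prior-weighted Gaussian density is highest."""
    scores = {label: priors[label] * gaussian_pdf(pixel, *stats[label]) for label in stats}
    return max(scores, key=scores.get)

# Hypothetical hand-labelled training pixels.
sand = [200, 205, 207, 210, 212, 208]
objects = [30, 60, 90, 120, 45, 80]

stats = {"sand": fit_class(sand), "object": fit_class(objects)}
total = len(sand) + len(objects)
priors = {"sand": len(sand) / total, "object": len(objects) / total}

labels = [classify(p, stats, priors) for p in (206, 70, 150)]
print(labels)
```

Note that a class whose samples are all identical would give a standard deviation of zero; a real implementation would need to guard against that.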
Approach 3:
Approach 3 is more complex than the two above: you can use a texture-based operation to segment out the sand-type texture. For texture-based methods, though, I would recommend supervised classification over unsupervised methods (like k-means).
Texture features you can use include:
Basic:
Range of gray levels in a defined neighborhood.
local mean and variance or entropy.
Gray Level Co-occurrence Matrices (GLCM).
Advanced:
Local Binary Patterns.
Wavelet Transform.
Gabor Transform, etc.
PS: In my opinion you should try approaches 1 and 2 first; they can save you a lot of work. :)
For better results you should combine several algorithms. The OpenCV tutorials always focus on one feature of OpenCV at a time. Real CV applications should use as many techniques and algorithms as possible.
I used this approach to detect biological cells in noisy pictures, and I got very good results by applying some contextual information:
Expected size of cells
The fact that all cells have similar size
Expected number of cells
So I changed many parameters and tried to detect what I'm looking for.
If you use edge detection, the sand will give rather random shapes. Try changing the Canny parameters and detecting lines, rectangles, circles, etc.: any shapes that are more probable for litter. Remember the positions of the detected objects for each parameter set, and at the end give priority to those positions (areas) where shapes were detected most often.
Use color-separation. The peaks in color-histogram could be the hints to the litter, as the distribution of sand-colors should be more even.
For some often appearing, small objects like cigarette-stubs you can apply object matching.
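The idea of remembering detections across parameter sets and prioritising the areas that are hit most often could be sketched like this. The grid cell size, the function names, and the detection coordinates are made up for illustration; in practice each run would be the list of shape centres found with one set of Canny parameters.

```python
from collections import Counter

def vote_cells(detections_per_run, cell=20):
    """Count, per coarse grid cell, how many runs detected something there."""
    votes = Counter()
    for detections in detections_per_run:
        # A cell gets at most one vote per run, however many shapes fall in it.
        cells = {(x // cell, y // cell) for x, y in detections}
        votes.update(cells)
    return votes

def stable_cells(votes, min_votes):
    """Cells that were detected in at least `min_votes` runs."""
    return sorted(c for c, n in votes.items() if n >= min_votes)

# Hypothetical shape centres (x, y) from three different parameter sets.
runs = [
    [(105, 48), (300, 210)],
    [(110, 52), (15, 400)],
    [(101, 45), (305, 215), (250, 30)],
]
votes = vote_cells(runs)
print(stable_cells(votes, min_votes=2))
```

Detections caused by sand texture tend to move around between runs, so they collect few votes, while real objects keep landing in the same cells.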
P.S:
Cool application! Just out of curiosity, are you going to scan the beach with a quadcopter?
If you want to detect objects on such a uniform background, you should start by detecting the main color in the image. That way you will detect all the sand, and the objects will be in the remaining parts. You can take a look at the papers published by Arnaud LeTrotter and Ludovic Llucia, who both used this type of "main color detection".
I'm attempting to wrap my head around the basics of CV. The bit that initially got me interested was template matching (it was mentioned in a Pycon talk unrelated to CV), so I figured I'd start there.
I started with this image:
Out of which I want to detect Mario. So I cut him out:
I understand the concept of sliding the template around the image to see the best fit, and following a tutorial, I'm able to find mario with the following code:
import time
import cv  # legacy cv module from OpenCV 2.x

def match_template(img, template):
    s = time.time()
    img_size = cv.GetSize(img)
    template_size = cv.GetSize(template)
    # The result map holds one score per valid template position
    img_result = cv.CreateImage((img_size[0] - template_size[0] + 1,
                                 img_size[1] - template_size[1] + 1), cv.IPL_DEPTH_32F, 1)
    cv.Zero(img_result)
    cv.MatchTemplate(img, template, img_result, cv.CV_TM_CCORR_NORMED)
    min_val, max_val, min_loc, max_loc = cv.MinMaxLoc(img_result)
    # inspect.getargspec(cv.MinMaxLoc)
    print(min_val)
    print(max_val)
    print(min_loc)
    print(max_loc)
    # Draw a rectangle at the best-scoring position
    cv.Rectangle(img, max_loc, (max_loc[0] + template.width, max_loc[1] + template.height), cv.Scalar(120.), 2)
    print(time.time() - s)
    cv.NamedWindow("Result")
    cv.ShowImage("Result", img)
    cv.WaitKey(0)
    cv.DestroyAllWindows()
So far so good, but then I came to realize that this is incredibly fragile. It will only ever find Mario with that specific background, and with that specific animation frame being displayed.
So I'm curious: given that Mario will always have the same Mario-ish attributes (size, colors), is there a technique with which I could find him regardless of whether his current frame is standing still or one of the various run-cycle sprites? Kind of like the fuzzy matching that you can do on strings, but for images.
Maybe since he's the only red thing, there is a way of simply tracking the red pixels?
The whole other issue is removing the background from the template. Maybe that would help the MatchTemplate function find Mario even though he doesn't exactly match the template? As of now, I'm not entirely sure how that would work (I see that there is a mask param in MatchTemplate, but I'll have to investigate further).
My main question is whether or not template matching is the way to go about detecting an image that is mostly the same, but varies (like when he's walking), or is there another technique I should look into?
Update:
Attempts at matching other Marios
Going off of mmgp's suggestion that it should be workable for matching other things, I ran a couple of tests.
I used this as the template to match:
And then took a couple of screen shots to test the matching against.
For the first, I successfully find Mario, and get a max value of 1.
However, trying to find jumping Mario results in a complete misfire.
Now granted, the Mario in the template and the Mario in the scene are facing opposite directions, as well as being different animation frames, but I would think they still match a lot more than anything else in the image, if only for the colors alone. But it targets the platform as being the closest match to the template.
Note that the max value for this one was 0.728053808212.
Next I tried a scene without mario to see what would happen.
But oddly enough, I get the exact result as the image with jumping mario -- right down to the similarity value: 0.728053808212. Mario being in the picture is just as accurate as him not being in the picture.
Really strange! I don't know the actual details of the underlying algorithm, but I'd imagine, from a standard deviation perspective, the boxes in the scene that at least match the Red in template Mario's suit would be closer to the mean distance than a blue platform, right? So, it's extra confusing that it's not even in the general area of where I would expect it to be.
I'm guessing this is user error on my end, or maybe just a misunderstanding.
Why would a scene with a similar Mario have as much of a match as a scene with no Mario at all?
No method is infallible, but template matching does have a good chance of working here. It might require some pre-processing, and until there is a larger sample (a short video, for example) to demonstrate the possible problems, there isn't much point in trying more advanced methods simply because some library implements them for you -- especially if you don't know under which conditions they are expected to work.
For instance, here are the results I get using template matching (red rectangles) -- all of them use the template http://i.stack.imgur.com/EYs9B.png, even the last one:
To achieve that, I started by considering only the red channel of both the template and the input image. From that we can easily calculate the internal morphological gradient, and only then perform the matching. To avoid getting a rectangle when Mario is not present, you need to set a minimum threshold for the match score. Here is the template and one of the images after these two transformations:
And here is some sample code to achieve that:
import sys
import cv2
import numpy
img = cv2.imread(sys.argv[1])
img2 = img[:,:,2]
img2 = img2 - cv2.erode(img2, None)
template = cv2.imread(sys.argv[2])[:,:,2]
template = template - cv2.erode(template, None)
ccnorm = cv2.matchTemplate(img2, template, cv2.TM_CCORR_NORMED)
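print(ccnorm.max())
loc = numpy.where(ccnorm == ccnorm.max())
threshold = 0.4
th, tw = template.shape[:2]
for pt in zip(*loc[::-1]):
    # Skip weak matches so no rectangle is drawn when Mario is absent
    if ccnorm[pt[::-1]] < threshold:
        continue
    cv2.rectangle(img, pt, (pt[0] + tw, pt[1] + th),
                  (0, 0, 255), 2)
# Assumption: the output path is passed as a third argument; writing to
# sys.argv[2] would overwrite the template file.
cv2.imwrite(sys.argv[3], img)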
I expect it to fail in more varied situations, but there are a couple of easy adjustments to be done.
Template matching doesn't always give good results. You should look into keypoint matching.
Step1: Find Keypoints
Let's assume that you managed to cut out Mario or get an ROI image of Mario. Make this image your template image. Now find keypoints in the main image and also in the template. You now have two sets of keypoints: one for the image and the other for Mario (the template).
You can use SIFT, SURF, ORB depending on your preferences.
[EDIT]:
This is what I got using this method with SIFT and flann based knn matching. I haven't done the bounding box part.
Since your template is very small, SIFT and SURF would not give many keypoints. To get a good number of feature points, you could try the Harris corner detector. I applied Harris corner detection to the image and got pretty good points on Mario.
Step2: Match Keypoints
If you have used SIFT or SURF, you will have descriptors for both the image and the template. Match these keypoints using KNN or some other efficient matching algorithm. If you are using OpenCV, I'd suggest looking into the FLANN-based matcher. After matching the keypoints, you will want to filter out the incorrect matches. You can do this with k-nearest neighbors: depending on the distance of the nearest match, you can filter out keypoints. You can filter your matches further using forward-backward error.
Forward-Backward Error Estimation:
Match the template keypoints to the image keypoints. This will give you a set of matches.
Match the image keypoints to the template keypoints. This will give you another set of matches.
The intersection of these two sets filters out the incorrect matches.
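To make the forward-backward filtering concrete, here is a small sketch in plain Python using brute-force nearest-neighbour search. The two-dimensional "descriptors" are toy values chosen for illustration; real SIFT or SURF descriptors are much longer vectors, and the function names are my own.

```python
def nearest(query, candidates):
    """Index of the candidate descriptor closest to `query` (squared Euclidean distance)."""
    return min(range(len(candidates)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(query, candidates[j])))

def cross_check(template_desc, image_desc):
    """Keep only the matches that agree in both directions."""
    forward = [nearest(d, image_desc) for d in template_desc]    # template -> image
    backward = [nearest(d, template_desc) for d in image_desc]   # image -> template
    return [(i, j) for i, j in enumerate(forward) if backward[j] == i]

# Toy descriptors: template point 2 also lands on image point 0,
# but image point 0 prefers template point 0, so that match is dropped.
template_desc = [(0.0, 0.0), (10.0, 0.0), (1.0, 1.0)]
image_desc = [(0.2, 0.1), (9.8, 0.3)]

matches = cross_check(template_desc, image_desc)
print(matches)
```

If you use OpenCV's brute-force matcher, `cv2.BFMatcher` accepts a `crossCheck=True` argument that applies the same kind of two-way consistency test for you.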
[EDIT]:
If you are using the Harris corner detector, you get only points, not keypoints. You can either convert them into keypoints or write your own brute-force matcher. It's not that difficult.
Step3: Estimation
After filtering the keypoints, you'd have a cluster of keypoints near your object (in this case, Mario) and few scattered keypoints. To eliminate these scattered keypoints, you could use clustering. DBSCAN clustering will help you get a good cluster of points.
Step4: Bounding Box Estimation
Now you have a cluster of keypoints. Using k-means, you should try to find the center of the cluster. Once you obtain the center of the cluster, you can estimate the bounding box.
I hope this helps.
[EDIT]
Trying to match points using Harris corners. After filtering the Harris corners, I use a brute-force method to match the points. A better algorithm might give you better results.
I have this project where I need (on iOS) to detect simple geometric shapes inside an image.
After searching the internet I have concluded that the best tool for this is OpenCV. The thing is that up until two hours ago I had no idea what OpenCV was, and I have never done anything even remotely involving image processing. My main experience is JS/HTML, C#, SQL, Objective-C...
Where do I start with this?
I have found this answer that I was able to digest, and from other reading I understand that OpenCV should return an array of shapes with their points/corners. Is this true? Also, how will it represent a circle or a half circle?
Also what about the shape orientation?
Do you know of any Demo iOS project that can demonstrate a similar functionality?
If you have only these regular shapes, there is a simple procedure, as follows:
Find the contours in the image (the image should be binary, as given in your question).
Approximate each contour using the approxPolyDP function.
Check the number of elements in each approximated contour; this is what identifies the shape. For example, a square will have 4 and a pentagon will have 5. Circles will have more; I didn't know how many, so I measured it (I got 16 for the circle and 9 for the half-circle).
Now assign the colors: run the code on your test image, check each contour's vertex count, and fill it with the corresponding color.
Below is my example in Python:
import numpy as np
import cv2
img = cv2.imread('shapes.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(gray,127,255,1)
contours,h = cv2.findContours(thresh,1,2)
for cnt in contours:
    # Approximate the contour with a polygon (tolerance: 1% of its perimeter)
    approx = cv2.approxPolyDP(cnt,0.01*cv2.arcLength(cnt,True),True)
    print(len(approx))
    if len(approx)==5:
        print("pentagon")
        cv2.drawContours(img,[cnt],0,255,-1)
    elif len(approx)==3:
        print("triangle")
        cv2.drawContours(img,[cnt],0,(0,255,0),-1)
    elif len(approx)==4:
        print("square")
        cv2.drawContours(img,[cnt],0,(0,0,255),-1)
    elif len(approx) == 9:
        print("half-circle")
        cv2.drawContours(img,[cnt],0,(255,255,0),-1)
    elif len(approx) > 15:
        print("circle")
        cv2.drawContours(img,[cnt],0,(0,255,255),-1)
cv2.imshow('img',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Below is the output:
Remember, it works only for regular shapes.
Alternatively, to find circles you can use HoughCircles. You can find a tutorial here.
Regarding iOS, the OpenCV devs are developing some iOS samples this summer, so visit their site, www.code.opencv.org, and contact them.
You can find slides of their tutorial here : http://code.opencv.org/svn/gsoc2012/ios/trunk/doc/CVPR2012_OpenCV4IOS_Tutorial.pdf
The answer depends on the presence of other shapes, the level of noise (if any), and the invariance you want to provide for (e.g. rotation, scaling, etc.). These requirements will define not only the algorithm but also the pre-processing stages required to extract features.
Template matching that was suggested above works well when shapes aren't rotated or scaled and when there are no similar shapes around; in other words, it finds a best translation in the image where template is located:
double minVal, maxVal;
Point minLoc, maxLoc;
Mat image, templ, result; // templ is your shape ("template" is a reserved word in C++)
matchTemplate(image, templ, result, CV_TM_CCOEFF_NORMED);
minMaxLoc(result, &minVal, &maxVal, &minLoc, &maxLoc); // maxLoc is the answer
Geometric hashing is a good method for getting invariance to rotation and scaling; this method requires extracting some contour points.
The generalized Hough transform can take care of invariance and noise and needs minimal pre-processing, but it is a bit harder to implement than the other methods. OpenCV has such transforms for lines and circles.
When the number of shapes is limited, calculating moments or counting convex hull vertices may be the easiest solution: OpenCV structural analysis
You can also use template matching to detect shapes inside an image.
What is the best way to match the scan (taken photo) point sets to the template point set (blue,green,red,pink circles in the images)?
I am using opencv/c++. Maybe some kind of the ICP algorithm? I would like to wrap the scan image to the template image!
template point set:
scan point set:
If the object is reasonably rigid and aligned, simple cross-correlation would do the trick.
If not, I would use RANSAC to estimate the transformation between the subject and the template (it seems that you have the feature points). Please provide some details on the problem.
Edit:
RANSAC (Random Sample Consensus) could be used in your case. Think of the unnecessary points in your template as noise (false features detected by a feature detector): they are the outliers. RANSAC can handle outliers because it randomly chooses a small subset of feature points (the minimal number needed to initialize your model), fits the model, and calculates how well the model matches the given data (how many other points in the template correspond to your other points). If you choose a wrong subset, this value will be low and you drop the model. If you choose a right subset, it will be high, and you can then improve your match with an LMS algorithm.
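Here is a minimal sketch of that loop in plain Python for the simplest possible model, a pure translation, where a single correspondence is the minimal subset. The point lists, the tolerance, the iteration count, and the function name are assumptions for illustration; a homography would need four correspondences per sample and a different solver.

```python
import random

def ransac_translation(src, dst, tol=2.0, iterations=50, seed=0):
    """Estimate (tx, ty) mapping src[i] -> dst[i] while ignoring outlier pairs."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Minimal subset: one correspondence fully defines a translation.
        k = rng.randrange(len(src))
        dx, dy = dst[k][0] - src[k][0], dst[k][1] - src[k][1]
        # Consensus: which pairs agree with this translation within `tol`?
        inliers = [i for i, (p, q) in enumerate(zip(src, dst))
                   if (p[0] + dx - q[0]) ** 2 + (p[1] + dy - q[1]) ** 2 <= tol ** 2]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refinement: least-squares fit (the mean offset) over all inliers.
    n = len(best_inliers)
    tx = sum(dst[i][0] - src[i][0] for i in best_inliers) / n
    ty = sum(dst[i][1] - src[i][1] for i in best_inliers) / n
    return (tx, ty), best_inliers

# Six pairs shifted by (5, -3) plus two bad correspondences.
src = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5), (3, 8), (7, 2), (1, 1)]
dst = [(5, -3), (15, -3), (5, 7), (15, 7), (10, 2), (8, 5), (40, 40), (-20, 9)]

translation, inliers = ransac_translation(src, dst)
print(translation, inliers)
```

The final averaging step plays the role of the least-squares refinement mentioned above: once the outliers are excluded, all remaining pairs contribute to the estimate.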
Do you have to match the red rectangles? The original image contains four black rectangles in the corners that seem to be made for matching. I can reliably find them with 4 lines of Mathematica code:
lotto = [source image]
lottoBW = Image[Map[Max, ImageData[lotto], {2}]]
This takes max(R,G,B) for each pixel, i.e. it filters out the red and yellow print (more or less). The result looks like this:
Then I just use a LoG filter to find the dark spots and look for local maxima in the resulting image:
lottoBWG = ImageAdjust[LaplacianGaussianFilter[lottoBW, 20]]
MaxDetect[lottoBWG, 0.5]
Result:
Have you looked at OpenCV's descriptor_extractor_matcher.cpp sample? This sample uses RANSAC to detect the homography between the two input images. I assume when you say wrap you actually mean warp? If you would like to warp the image with the homography matrix you detect, have a look at the warpPerspective function. Finally, here are some good tutorials using the different feature detectors in OpenCV.
EDIT :
You may not have SURF features, but you certainly have feature points with different classes. Feature based matching is generally split into two phases: feature detection (which you have already done), and extraction which you need for matching. So, you might try converting your features into a KeyPoint and then doing the feature extraction and matching. Here is a little code snippet of how you might go about this:
const int RED_TYPE = 1;
const int GREEN_TYPE = 2;
const int BLUE_TYPE = 3;
const int PURPLE_TYPE = 4;
struct BenFeature
{
    Point2f pt;
    int classId;
};
vector<BenFeature> benFeatures;
// Detect the features as you normally would, in addition to setting the class ID
vector<KeyPoint> keypoints;
for(size_t i = 0; i < benFeatures.size(); i++)
{
    BenFeature bf = benFeatures[i];
    KeyPoint kp(bf.pt,
                10.0, // feature neighborhood diameter (you'll probably need to tune it)
                -1.0, // (angle) -1 == not applicable
                500.0, // feature response strength (set to the same unless you have a metric describing strength)
                1, // octave level, (ditto as above)
                bf.classId // RED, GREEN, BLUE, or PURPLE.
                );
    keypoints.push_back(kp);
}
// now proceed with extraction and matching...
You may need to tune the response strength such that it doesn't get thresholded out by the extraction phase. But, hopefully that's illustrative of what you might try to do.
Follow these steps:
Match points or features in the two images; this will determine your warping.
Determine what transformation you are looking for in your warping. The most general would be a homography (see cv::findHomography()) and the least general would be a simple translation (use cv::matchTemplate()). The intermediate case would be translation along x and y plus rotation. For this I wrote a fast function that is better than a homography, since it uses fewer degrees of freedom while still optimizing the right metric (squared differences in coordinates):
https://stackoverflow.com/a/18091472/457687
If you think your matches have a lot of outliers use RANSAC on top of your step 1. You basically need to randomly select a minimal set of points required for finding parameters, solve, determine inliers, solve again using all inliers, and then iterate trying to improve your current solution (increase the number of inliers, reduce error, or both). See Wikipedia for RANSAC algorithm: http://en.wikipedia.org/wiki/Ransac
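For the intermediate case in step 2 (translation along x and y plus rotation), a closed-form least-squares fit over matched points can be sketched as below. This is not the function from the linked answer, just a plain-Python illustration of the standard centred-points solution; the function name and the sample points are made up.

```python
import math

def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src points onto dst points."""
    n = len(src)
    # Centroids of both point sets.
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    # Accumulate dot and cross products of the centred points.
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy
        bx, by = qx - dx, qy - dy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that carries the rotated source centroid onto the target centroid.
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, (tx, ty)

# Source points rotated by 90 degrees and shifted by (2, 3).
src = [(0, 0), (1, 0), (0, 1), (2, 2)]
dst = [(2, 3), (2, 4), (1, 3), (0, 5)]

theta, (tx, ty) = fit_rigid_2d(src, dst)
print(math.degrees(theta), tx, ty)
```

With only three parameters to estimate, this kind of fit is less likely than a full homography to bend itself around a few bad matches, and it can be used as the model inside a RANSAC loop with two correspondences per sample.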