I'm trying to find coins at different images and mark their location. Coins always are perfect circles (not ellipses), but they can touch or even overlap. Here are some example images, as well as results of my tries (a Python script using skimage and its outputs), but it doesn't seem to perform well.
The script:
def edges(img, t):
#adapt_rgb(each_channel)
def filter_rgb(image):
sigma = 1
return feature.canny(image, sigma=sigma, low_threshold=t/sigma/2, high_threshold=t/sigma)
edges = color.rgb2hsv(filter_rgb(img))
edges = edges[..., 2]
return edges
images = io.ImageCollection('*.bmp', conserve_memory=True)
for i, im in enumerate(images):
es = edges(im, t=220)
output = im.copy()
circles = cv2.HoughCircles((es*255).astype(np.uint8), cv2.cv.CV_HOUGH_GRADIENT, dp=1, minDist=50, param2=50, minRadius=0, maxRadius=0)
if circles is not None:
circles = np.round(circles[0, :]).astype("int")
for (x, y, r) in circles:
cv2.circle(output, (x, y), r, (0, 255, 0), 4)
cv2.rectangle(output, (x - 5, y - 5), (x + 5, y + 5), (0, 128, 255), -1)
# now es is edges
# and output is image with marked circles
A couple of example images, with detected edges and circles:
I am using canny edge detection & hough transform, which is the most common way to detect circles. However, with the same parameters it finds almost nothing on some photos, and finds way too many circles on other.
Can you give me any pointers and suggestions on how to do this better?
I ended up using dlib's object detector and it performed very well. The detector can be easily applied for detecting any kind of objects. For some related discussion see the question topic on reddit.
Hmmm, I would do some morphological operations in the canny results, such
as closing and opening operations:
http://en.wikipedia.org/wiki/Mathematical_morphology
I would also recommend you to take a look in the watershed scheme. Applied
directly into the image gradient and then a Hough Transform on it.
http://en.wikipedia.org/wiki/Watershed_%28image_processing%29
Related
I am trying to identify a rectangle underwater in a noisy environment. I implemented Canny to find the edges, and drew the found edges using cv2.circle. From here, I am trying to identify the imperfect rectangle in the image (the black one below the long rectangle that covers the top of the frame)
I have attempted multiple solutions, including thresholds, blurs and resizing the image to detect the rectangle. Below is the barebones code with just drawing the identified edges.
import numpy as np
import cv2
import imutils
img_text = 'img5.png'
img = cv2.imread(img_text)
original = img.copy()
min_value = 50
max_value = 100
# draw image and return coordinates of drawn pixels
image = cv2.Canny(img, min_value, max_value)
indices = np.where(image != 0)
coordinates = zip(indices[1], indices[0])
for point in coordinates:
cv2.circle(original, point, 1, (0, 0, 255), -1)
cv2.imshow('original', original)
cv2.waitKey(0)
cv2.destroyAllWindows()
Where the output displays this:
output
From here I want to be able to separately detect just the rectangle and draw another rectangle on top of the output in green, but I haven't been able to find a way to detect the original rectangle on its own.
For your specific image, I obtained quite good results with a simple thresholding on the blue channel.
image = cv2.imread("test.png")
t, img = cv2.threshold(image[:,:,0], 80, 255, cv2.THRESH_BINARY)
In order to adapt the threshold, I propose a simple way of varying the threshold until you get one component. I have also implemented the rectangle drawing:
def find_square(image):
markers = 0
threshold = 10
while np.amax(markers) == 0:
threshold += 5
t, img = cv2.threshold(image[:,:,0], threshold, 255, cv2.THRESH_BINARY_INV)
_, markers = cv2.connectedComponents(img)
kernel = np.ones((5,5),np.uint8)
img = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)
img = cv2.morphologyEx(img, cv2.MORPH_DILATE, kernel)
nonzero = cv2.findNonZero(img)
x, y, w, h = cv2.boundingRect(nonzero)
cv2.rectangle(image, (x, y), (x+w, y+h), (0, 255, 0), 2)
cv2.imshow("image", image)
And the results on the provided example images:
The idea behind this approach is based on the observation that the most information is in the blue channel. If you separate the images in the channels, you will see that in the blue channel, the dark square has the best contrast. It is also the darkest region on this channel, which is why thresholding works. The problem remains the threshold setting. Based on the above intuition, we are looking for the lowest threshold that will bring up something (and hope that it will be the square). What I did is to simply increase gradually the threshold until something appears.
Then, I applied some morphology operations to eliminate other small points that may appear after thresholding and to make the square look a bit bigger (the edges of the square are lighter, and therefore not the entire square is captured). Then is was a matter of drawing the rectangle.
The code can be made much nicer (and more efficient) by doing some statistical analysis on the histogram. Simply compute the threshold such that 5% (or some percent) of the pixels are darker. You may require do so a connected component analysis to keep the biggest blob.
Also, my usage of connectedComponents is very poor and inefficient. Again, code written in a hurry to prove the concept.
I'd like to extract the contours of an image, expressed as a sequence of point coordinates.
With Canny I'm able to produce a binary image that contains only the edges of the image. Then, I'm trying to use findContours to extract the contours. The results are not OK, though.
For each edge I often got 2 lines, like if it was considered as a very thin area.
I would like to simplify my contours so I can draw them as single lines. Or maybe extract them with a different function that directly produce the correct result would be even better.
I had a look on the documentation of OpenCV but I was't able to find anything useful, but I guess that I'm not the first one with a similar problem. Is there any function or method I could use?
Here is the Python code I've written so far:
def main():
img = cv2.imread("lena-mono.png", 0)
if img is None:
raise Exception("Error while loading the image")
canny_img = cv2.Canny(img, 80, 150)
contours, hierarchy = cv2.findContours(canny_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
contours_img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
scale = 10
contours_img = cv2.resize(contours_img, (0, 0), fx=scale, fy=scale)
for cnt in contours:
color = np.random.randint(0, 255, (3)).tolist()
cv2.drawContours(contours_img,[cnt*scale], 0, color, 1)
cv2.imwrite("canny.png", canny_img)
cv2.imwrite("contours.png", contours_img)
The scale factor is used to highlight the double lines of the contours.
Here are the links to the images:
Lena greyscale
Edges extracted with Canny
Contours: 10x zoom where you can see the wrong results produced by findContours
Any suggestion will be greatly appreciated.
If I understand you right, your question has nothing to do with finding lines in a parametric (Hough transform) sense.
Rather, it is an issue with the findContours method returning multiple contours for a single line.
This is because Canny is an edge detector - that means it is filter attuned to the image intensity gradient which occurs on both sides of a line.
So your question is more akin to: “how can I convert low-level edge features to single line?”, or perhaps: “how can I navigate the contours hierarchy to detect single lines?"
This is a fairly common topic - and here is a previous post which proposed one solution:
OpenCV converting Canny edges to contours
I have the following image
I'm trying to find the pixel coordinates of main rectangles (those between white lines). I tried few things but I can't obtain good enough solution. The solution doesn't have to be perfect and it's ok if not all rectangles are detected (especially those really small ones). Though corners location will have to be as much exact as possible, especially with those bigger blurry (I'm trying to write some simple AR engine).
I can clarify there are only 4 levels of grayscale: 0, 110, 180 and 255 (when printing, no screen it will vary because of lightning and shadows)
So far I tried few things:
manual multilevel thresholding (because of shadows and different lightning it didn't work)
adaptive thresholding : 2 problems:
it combines 180 and 255 colors into white, and 0, 110 into black
edge/corner location of blurred(bigger) rectangles is not exact (it adds blur to rectangle area)
sobel edge detection (corners of blurred rectangles are more sharp, but it detects also inner edges in rectangles, also those edge contours are not always closed
Looks like combining those two thinks somehow would yield better results. Or maybe somebody have different idea?
I was also thinking about doing floodfill, but it was hard to find for sure good seed point and threshold automatically (there might be some other white objects in the background). Besides I will want to optimize later this for GPU and floodfill algorithm is rather not a good fit for this.
Below is some sample code I tried so far:
image = cv2.imread('data/image.jpg');
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
cv2.imshow('image', gray)
adaptive = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 601, 0)
cv2.imshow('adaptive', adaptive)
gradx = cv2.Sobel(gray, ddepth=cv2.CV_32F, dx=1, dy=0, ksize=3)
grady = cv2.Sobel(gray, ddepth=cv2.CV_32F, dx=0, dy=1, ksize=3)
abs_gradx = cv2.convertScaleAbs(grady)
abs_grady = cv2.convertScaleAbs(grady)
grad = cv2.addWeighted(abs_gradx, 0.5, abs_grady, 0.5, 0)
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS,(5,5))
grad = cv2.morphologyEx(grad, cv2.MORPH_OPEN, kernel)
grad = cv2.morphologyEx(grad, cv2.MORPH_CLOSE, kernel)
cv2.imshow('sobel',grad)
#kernel = cv2.getStructuringElement(cv2.MORPH_CROSS,(7,7))
#grad = cv2.morphologyEx(grad, cv2.MORPH_OPEN, kernel)
retval, grad = cv2.threshold(grad, 10, 255, cv2.THRESH_BINARY)
cv2.imshow('sobel+morph+thrs',grad)
cv2.waitKey()
I believe your answer lies in using the Hough transform to detect lines, extending these lines to span the breaks between the darker squares, and then looking for intersects marking out corners. I've had a quick play in Matlab and have come up with the following, it's not perfect but should show the potential:
% Open image
i = imread('http://i.stack.imgur.com/kwcXm.jpg');
% Use a sharpening filter to enhance some of the edges
H = fspecial('unsharp');
i = imfilter(i, H, 'replicate');
% Detect edge segments using canny
BW = edge(i, 'canny');
% Apply hough transform to edges
[H, T, R] = hough(BW, 'RhoResolution', 0.5, 'Theta', -90:0.5:89.5);
% Find peaks in hough transform
P = houghpeaks(H, 5, 'threshold', ceil(0.1*max(H(:))));
% Extract lines from peaks, extending partial lines
lines = houghlines(BW, T, R, P, 'FillGap', 100, 'MinLength', 5);
% Plot detected lines on image
imshow(i); hold on;
for k = 1:length(lines)
xy = [lines(k).point1; lines(k).point2];
plot(xy(:,1),xy(:,2),'LineWidth',2,'Color','green');
end
With the final result:
Obviously there's room for improvement, with a number of lines still to detect, but if tweaking the various parameters doesn't work you could take the initial result and the search for more lines with similar angles to obtain a more complete set. The corners can then be found from the intersects which should be simple enough to extract.
I would try the following:
Hough transform to detect all straight lines
Look for sets of parallel lines that are:
A sufficient distance apart
Separated by the lighter color
There are a couple of things that make your problem trickier than it needs to be:
Perspective distortion
Changes in lighting, minor shadows
If you could minimize the above, it may help solving the problem.
I'm intending to write a program to detect and differentiate certain objects from a nearly solid background. The foreground and the background have a high contrast difference which I would further increase to aid in the object identification process. I'm planning to use Hough transform technique and OpenCV.
Sample image
As seen in the above image, I would want to separately identify the circular objects and the square objects (or any other shape out of a finite set of shapes). Since I'm quite new to image processing I do not have an idea whether such a situation needs a neural network to be implemented and each shape to be learned beforehand. Would a technique such as template matching let me do this without a neural network?
These posts will get you started:
How to detect circles
How to detect squares
How to detect a sheet of paper (advanced square detection)
You will probably have to adjust some parameters in these codes to match your circles/squares, but the core of the technique is shown on these examples.
If you intend to detect shapes other than just circles, (and from the image I assume you do), I would recommend the Chamfer matching for a quick start, especially as you have a good contrast.
The basic premise, explained in simple terms, is following:
You do an edge detection (for example, cvCanny in opencv)
You create a distance image, where the value of each pixel means the distance fom the nearest edge.
You take the shapes you would like to detect, define sample points along the edges of the shape, and try to match these points on the distance image. Basically you just add the values on the distance image which are "under" the coordinates of your sample points, given a specific position of your objects.
Find a good minimization algorithm, the effectiveness of this depends on your application.
This basic approach is a general solution, usually works well, but without further advancements, it is very slow.
Usually it's a good idea to first separate the objects of interest, so you don't have to always do the full search on the whole image. Find a good threshold, so you can separate objects. You still don't know which object it is, but you only have to do the matching itself in close proximity of this object.
Another good idea is, instead of doing the full search on the high resolution image, first do it on a very low resolution. The result will not be very accurate, but you can know the general areas where it's worth to do a search on a higher resolution, so you don't waste your time on areas where there is nothing of interest.
There are a number of more advanced techniques, but it's still worth to take a look at the basic chamfer matching, as it is the base of a large number of techniques.
With the assumption that the objects are simple shapes, here's an approach using thresholding + contour approximation. Contour approximation is based on the assumption that a curve can be approximated by a series of short line segments which can be used to determine the shape of a contour. For instance, a triangle has three vertices, a square/rectangle has four vertices, a pentagon has five vertices, and so on.
Obtain binary image. We load the image, convert to grayscale, Gaussian blur, then adaptive threshold to obtain a binary image.
Detect shapes. Find contours and identify the shape of each contour using contour approximation filtering. This can be done using arcLength to compute the perimeter of the contour and approxPolyDP to obtain the actual contour approximation.
Input image
Detected objects highlighted in green
Labeled contours
Code
import cv2
def detect_shape(c):
# Compute perimeter of contour and perform contour approximation
shape = ""
peri = cv2.arcLength(c, True)
approx = cv2.approxPolyDP(c, 0.04 * peri, True)
# Triangle
if len(approx) == 3:
shape = "triangle"
# Square or rectangle
elif len(approx) == 4:
(x, y, w, h) = cv2.boundingRect(approx)
ar = w / float(h)
# A square will have an aspect ratio that is approximately
# equal to one, otherwise, the shape is a rectangle
shape = "square" if ar >= 0.95 and ar <= 1.05 else "rectangle"
# Star
elif len(approx) == 10:
shape = "star"
# Otherwise assume as circle or oval
else:
shape = "circle"
return shape
# Load image, grayscale, Gaussian blur, and adaptive threshold
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (7,7), 0)
thresh = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV,31,3)
# Find contours and detect shape
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
# Identify shape
shape = detect_shape(c)
# Find centroid and label shape name
M = cv2.moments(c)
cX = int(M["m10"] / M["m00"])
cY = int(M["m01"] / M["m00"])
cv2.putText(image, shape, (cX - 20, cY), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (36,255,12), 2)
cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.waitKey()
What is the best way to detect the corners of an invoice/receipt/sheet-of-paper in a photo? This is to be used for subsequent perspective correction, before OCR.
My current approach has been:
RGB > Gray > Canny Edge Detection with thresholding > Dilate(1) > Remove small objects(6) > clear boarder objects > pick larges blog based on Convex Area. > [corner detection - Not implemented]
I can't help but think there must be a more robust 'intelligent'/statistical approach to handle this type of segmentation. I don't have a lot of training examples, but I could probably get 100 images together.
Broader context:
I'm using matlab to prototype, and planning to implement the system in OpenCV and Tesserect-OCR. This is the first of a number of image processing problems I need to solve for this specific application. So I'm looking to roll my own solution and re-familiarize myself with image processing algorithms.
Here are some sample image that I'd like the algorithm to handle: If you'd like to take up the challenge the large images are at http://madteckhead.com/tmp
(source: madteckhead.com)
(source: madteckhead.com)
(source: madteckhead.com)
(source: madteckhead.com)
In the best case this gives:
(source: madteckhead.com)
(source: madteckhead.com)
(source: madteckhead.com)
However it fails easily on other cases:
(source: madteckhead.com)
(source: madteckhead.com)
(source: madteckhead.com)
EDIT: Hough Transform Progress
Q: What algorithm would cluster the hough lines to find corners?
Following advice from answers I was able to use the Hough Transform, pick lines, and filter them. My current approach is rather crude. I've made the assumption the invoice will always be less than 15deg out of alignment with the image. I end up with reasonable results for lines if this is the case (see below). But am not entirely sure of a suitable algorithm to cluster the lines (or vote) to extrapolate for the corners. The Hough lines are not continuous. And in the noisy images, there can be parallel lines so some form or distance from line origin metrics are required. Any ideas?
(source: madteckhead.com)
I'm Martin's friend who was working on this earlier this year. This was my first ever coding project, and kinda ended in a bit of a rush, so the code needs some errr...decoding...
I'll give a few tips from what I've seen you doing already, and then sort my code on my day off tomorrow.
First tip, OpenCV and python are awesome, move to them as soon as possible. :D
Instead of removing small objects and or noise, lower the canny restraints, so it accepts more edges, and then find the largest closed contour (in OpenCV use findcontour() with some simple parameters, I think I used CV_RETR_LIST). might still struggle when it's on a white piece of paper, but was definitely providing best results.
For the Houghline2() Transform, try with the CV_HOUGH_STANDARD as opposed to the CV_HOUGH_PROBABILISTIC, it'll give rho and theta values, defining the line in polar coordinates, and then you can group the lines within a certain tolerance to those.
My grouping worked as a look up table, for each line outputted from the hough transform it would give a rho and theta pair. If these values were within, say 5% of a pair of values in the table, they were discarded, if they were outside that 5%, a new entry was added to the table.
You can then do analysis of parallel lines or distance between lines much more easily.
Hope this helps.
Here's what I came up with after a bit of experimentation:
import cv, cv2, numpy as np
import sys
def get_new(old):
new = np.ones(old.shape, np.uint8)
cv2.bitwise_not(new,new)
return new
if __name__ == '__main__':
orig = cv2.imread(sys.argv[1])
# these constants are carefully picked
MORPH = 9
CANNY = 84
HOUGH = 25
img = cv2.cvtColor(orig, cv2.COLOR_BGR2GRAY)
cv2.GaussianBlur(img, (3,3), 0, img)
# this is to recognize white on white
kernel = cv2.getStructuringElement(cv2.MORPH_RECT,(MORPH,MORPH))
dilated = cv2.dilate(img, kernel)
edges = cv2.Canny(dilated, 0, CANNY, apertureSize=3)
lines = cv2.HoughLinesP(edges, 1, 3.14/180, HOUGH)
for line in lines[0]:
cv2.line(edges, (line[0], line[1]), (line[2], line[3]),
(255,0,0), 2, 8)
# finding contours
contours, _ = cv2.findContours(edges.copy(), cv.CV_RETR_EXTERNAL,
cv.CV_CHAIN_APPROX_TC89_KCOS)
contours = filter(lambda cont: cv2.arcLength(cont, False) > 100, contours)
contours = filter(lambda cont: cv2.contourArea(cont) > 10000, contours)
# simplify contours down to polygons
rects = []
for cont in contours:
rect = cv2.approxPolyDP(cont, 40, True).copy().reshape(-1, 2)
rects.append(rect)
# that's basically it
cv2.drawContours(orig, rects,-1,(0,255,0),1)
# show only contours
new = get_new(img)
cv2.drawContours(new, rects,-1,(0,255,0),1)
cv2.GaussianBlur(new, (9,9), 0, new)
new = cv2.Canny(new, 0, CANNY, apertureSize=3)
cv2.namedWindow('result', cv2.WINDOW_NORMAL)
cv2.imshow('result', orig)
cv2.waitKey(0)
cv2.imshow('result', dilated)
cv2.waitKey(0)
cv2.imshow('result', edges)
cv2.waitKey(0)
cv2.imshow('result', new)
cv2.waitKey(0)
cv2.destroyAllWindows()
Not perfect, but at least works for all samples:
A student group at my university recently demonstrated an iPhone app (and python OpenCV app) that they'd written to do exactly this. As I remember, the steps were something like this:
Median filter to completely remove the text on the paper (this was handwritten text on white paper with fairly good lighting and may not work with printed text, it worked very well). The reason was that it makes the corner detection much easier.
Hough Transform for lines
Find the peaks in the Hough Transform accumulator space and draw each line across the entire image.
Analyse the lines and remove any that are very close to each other and are at a similar angle (cluster the lines into one). This is necessary because the Hough Transform isn't perfect as it's working in a discrete sample space.
Find pairs of lines that are roughly parallel and that intersect other pairs to see which lines form quads.
This seemed to work fairly well and they were able to take a photo of a piece of paper or book, perform the corner detection and then map the document in the image onto a flat plane in almost realtime (there was a single OpenCV function to perform the mapping). There was no OCR when I saw it working.
Instead of starting from edge detection you could use Corner detection.
Marvin Framework provides an implementation of Moravec algorithm for this purpose. You could find the corners of the papers as a starting point. Below the output of Moravec's algorithm:
Also you can use MSER (Maximally stable extremal regions) over Sobel operator result to find the stable regions of the image. For each region returned by MSER you can apply convex hull and poly approximation to obtain some like this:
But this kind of detection is useful for live detection more than a single picture that not always return the best result.
After edge-detection, use Hough Transform.
Then, put those points in an SVM(supporting vector machine) with their labels, if the examples have smooth lines on them, SVM will not have any difficulty to divide the necessary parts of the example and other parts. My advice on SVM, put a parameter like connectivity and length. That is, if points are connected and long, they are likely to be a line of the receipt. Then, you can eliminate all of the other points.
Here you have #Vanuan 's code using C++:
cv::cvtColor(mat, mat, CV_BGR2GRAY);
cv::GaussianBlur(mat, mat, cv::Size(3,3), 0);
cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Point(9,9));
cv::Mat dilated;
cv::dilate(mat, dilated, kernel);
cv::Mat edges;
cv::Canny(dilated, edges, 84, 3);
std::vector<cv::Vec4i> lines;
lines.clear();
cv::HoughLinesP(edges, lines, 1, CV_PI/180, 25);
std::vector<cv::Vec4i>::iterator it = lines.begin();
for(; it!=lines.end(); ++it) {
cv::Vec4i l = *it;
cv::line(edges, cv::Point(l[0], l[1]), cv::Point(l[2], l[3]), cv::Scalar(255,0,0), 2, 8);
}
std::vector< std::vector<cv::Point> > contours;
cv::findContours(edges, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_TC89_KCOS);
std::vector< std::vector<cv::Point> > contoursCleaned;
for (int i=0; i < contours.size(); i++) {
if (cv::arcLength(contours[i], false) > 100)
contoursCleaned.push_back(contours[i]);
}
std::vector<std::vector<cv::Point> > contoursArea;
for (int i=0; i < contoursCleaned.size(); i++) {
if (cv::contourArea(contoursCleaned[i]) > 10000){
contoursArea.push_back(contoursCleaned[i]);
}
}
std::vector<std::vector<cv::Point> > contoursDraw (contoursCleaned.size());
for (int i=0; i < contoursArea.size(); i++){
cv::approxPolyDP(Mat(contoursArea[i]), contoursDraw[i], 40, true);
}
Mat drawing = Mat::zeros( mat.size(), CV_8UC3 );
cv::drawContours(drawing, contoursDraw, -1, cv::Scalar(0,255,0),1);
Convert to lab space
Use kmeans segment 2 cluster
Then use contours or hough on one of the clusters (intenral)