I need to find the edges of a document that the user is holding in their hands.
1) Original image from camera:
2) Then I convert the image to grayscale:
3) Then I apply a Gaussian blur:
4) Then I find edges with Canny:
5) Then I apply dilation:
As you can see in the last image, the contour around the map is torn, so the outline cannot be determined. What is my error, and how can I solve the problem so that the outline of the document is detected completely?
This is the code I use:
final Mat mat = new Mat();
sourceMat.copyTo(mat);
//convert the image to grayscale
Imgproc.cvtColor(mat, mat, Imgproc.COLOR_BGR2GRAY);
//blur to enhance edge detection
Imgproc.GaussianBlur(mat, mat, new Size(5, 5), 0);
if (isClicked) saveImageFromMat(mat, "blur", "blur");
//find edges with Canny (the output is an 8-bit binary edge map)
int thresh = 128;
Imgproc.Canny(mat, mat, thresh, thresh * 2);
//dilate helps to connect nearby line segments
Imgproc.dilate(mat, mat,
        Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(3, 3)),
        new Point(-1, -1), // anchor at the kernel centre
        2,                 // iterations
        1,                 // border type (BORDER_REPLICATE)
        new Scalar(1));    // border value
This answer is based on my above comment. If someone is holding the document, you cannot see the edge that is behind the user's hand. So, any method for detecting the outline of the document must be robust to some missing parts of the edge.
I suggest using a variant of the Hough transform to detect the document. The Wikipedia article about the Hough transform makes it sound quite scary (as Wikipedia often does with mathematical subjects), but don't be discouraged, actually they are not too difficult to understand or implement.
The original Hough transform detected straight lines in images. As explained in this OpenCV tutorial, any straight line in an image can be defined by 2 parameters: an angle θ and a distance r of the line from the origin. So you quantize these 2 parameters, and create a 2D array with one cell for every possible line that could be present in your image. (The finer the quantization you use, the larger the array you will need, but the more accurate the position of the found lines will be.) Initialize the array to zeros. Then, for every pixel that is part of an edge detected by Canny, you determine every line (θ,r) that the pixel could be part of, and increment the corresponding bin. After processing all pixels, you will have, for each bin, a count of how many pixels were detected on the line corresponding to that bin. Counts which are high enough probably represent real lines in the image, even if parts of the line are missing. So you just scan through the bins to find bins which exceed the threshold.
OpenCV contains Hough detectors for straight lines and circles, but not for rectangles. You could either use the line detector and check for 4 lines that form the edges of your document, or you could write your own Hough detector for rectangles, perhaps using the paper by Jung (2004) for inspiration. Rectangles have at least 5 degrees of freedom (2D position, scale, aspect ratio, and rotation angle), and the memory requirement for a 5D array obviously grows quickly. But since the range of each parameter is limited (i.e., the document's aspect ratio is known, and you can assume the document will be well centered and not rotated much), it is probably feasible.
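If you try the line-detector route, a minimal sketch in Python could look like the following; the file name and the Canny/vote thresholds are assumptions, and you would then keep the four strongest near-horizontal and near-vertical lines and intersect them to get the document corners:
import cv2
import numpy as np
img = cv2.imread('document.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edges = cv2.Canny(blurred, 128, 256)
# rho resolution of 1 px, theta resolution of 1 degree; the vote threshold
# controls how much of an edge may be missing before a line is rejected.
lines = cv2.HoughLines(edges, 1, np.pi / 180, 150)
if lines is not None:
    for rho, theta in lines[:, 0]:
        # Convert (rho, theta) back into two far-apart points so the line can be drawn.
        a, b = np.cos(theta), np.sin(theta)
        x0, y0 = a * rho, b * rho
        p1 = (int(x0 + 2000 * (-b)), int(y0 + 2000 * a))
        p2 = (int(x0 - 2000 * (-b)), int(y0 - 2000 * a))
        cv2.line(img, p1, p2, (0, 255, 0), 2)
The probabilistic variant cv2.HoughLinesP returns finite line segments instead of (rho, theta) pairs and can be easier to post-process.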
I'm trying to build an algorithm that calculates the dimensions of slabs (in pixel units for now). I tried masking, but there is no single HSV color range that works for all the test cases, as the slabs vary in color. I tried Otsu thresholding as well, but it didn't work very well...
Now I'm trying my hand with canny edge detection. The original image, and the image after canny-edge look like this:
I used dilation to make the central region a uniform white region, and then used contour detection. I identified the contour having the maximum area as the contour of interest. The resulting contours are a bit noisy, because the canny edge detection also included some background stuff that was irrelevant:
I used cv2.boundingRect() to estimate the height and width of the rectangle, but it keeps returning the height and width of the entire image. I presume this is because it works by calculating (max(x)-min(x),max(y)-min(y)) for each (x,y) in the contour, and in my case the resulting contour has some pixels touching the edges of the image, and so this calculation simply results in (image width, image height).
I am trying to get better images to work with, but assuming all images are like this, i.e. have noisy contours, what would be an alternative approach for determining the dimensions of the white rectangular region obtained after dilation?
To get the corner points of the rectangle, use this:
p = cv2.arcLength(cnt, True)  # cnt is the rectangle's contour
appr = cv2.approxPolyDP(cnt, 0.01 * p, True)  # appr contains the 4 corner points
# draw the rect
cv2.drawContours(img, [appr], 0, (0, 255, 0), 2)
The appr variable contains the corner points of the rectangle. You will still need to do some more cleaning to get better results, but cv2.boundingRect() is not a good fit for your case.
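If what you ultimately need is the slab's pixel dimensions, here is a rough follow-up sketch (assuming appr from the snippet above really contains the 4 corner points); it measures the side lengths directly, and falls back to a rotated bounding box, which is much less sensitive to stray border pixels than cv2.boundingRect():
import cv2
import numpy as np
# Sketch only: 'appr' is assumed to be the 4-point output of cv2.approxPolyDP above.
pts = appr.reshape(-1, 2).astype(np.float32)
if len(pts) == 4:
    # Consecutive points share an edge, so average opposite sides.
    sides = [np.linalg.norm(pts[i] - pts[(i + 1) % 4]) for i in range(4)]
    width = (sides[0] + sides[2]) / 2
    height = (sides[1] + sides[3]) / 2
else:
    # Fall back to the minimum-area rotated rectangle around the contour points.
    (cx, cy), (width, height), angle = cv2.minAreaRect(pts)
print('approximate dimensions in pixels:', width, height)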
The image below has many circles. Click and zoom in to see the circles.
https://drive.google.com/open?id=1ox3kiRX5hf2tHDptWfgcbMTAHKCDizSI
What I want is to count the circles, using any free language such as Python.
Is there a function or an approach for doing this?
Edit: I came up with a better solution, partially inspired by this answer below. I thought of this method originally (as noted in the OP comments) but I decided against it. The original image was just not good enough quality for it. However I improved that method and it works brilliantly for the better quality image. The original approach is first, and then the new approach at the bottom.
First approach
So here's a general approach that seems to work well, but definitely just gives estimates. This assumes that circles are roughly the same size.
First, the image is mostly blue---so it seems reasonable to just do the analysis on the blue channel. Thresholding the blue channel, in this case, using Otsu thresholding (which determines an optimal threshold value without input) seems to work very well. This isn't too much of a surprise since the distribution of color values is pretty much binary. Check the mask that results from it!
Then, do a connected component analysis on the mask to get the area of each component (component = white blob in the mask). The statistics returned from connectedComponentsWithStats() give (among other things) the area, which is exactly what we need. Then we can simply count the circles by estimating how many circles fit in a given component based on its area. Also note that I'm taking the statistics for every label except the first one: this is the background label 0, and not any of the white blobs.
Now, how large in area is a single circle? It would be best to let the data tell us. So you could compute a histogram of all the areas, and since there are more single circles than anything else, there will be a high concentration around 250-270 pixels or so for the area. Or you could just take an average of all the areas between something like 50 and 350 which should also get you in a similar ballpark.
Really in this histogram you can see the demarcations between single circles, double circles, triple, and so on quite easily. Only the larger components will give pretty rough estimates. And in fact, the area doesn't seem to scale exactly linearly. Blobs of two circles are slightly larger than two single circles, and blobs of three are larger still than three single circles, and so on, so this makes it a little difficult to estimate nicely, but rounding should still keep us close. If you want you could include a small multiplication parameter that increases as the area increases to account for that, but that would be hard to quantify without going through the histogram analytically...so, I didn't worry about this.
A single circle area divided by the average single circle area should be close to 1. And the area of a 5-circle group divided by the average circle area should be close to 5. And this also means that small insignificant components, that are 1 or 10 or even 100 pixels in area, will not count towards the total since round(50/avg_circle_size) < 1/2, so those will round down to a count of 0. Thus I should just be able to take all the component areas, divide them by the average circle size, round, and get to a decent estimate by summing them all up.
import cv2
import numpy as np
img = cv2.imread('circles.png')

# Otsu threshold on the blue channel; the threshold value passed here is
# ignored when THRESH_OTSU is set.
mask = cv2.threshold(img[:, :, 0], 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Connected component analysis; stats[:, CC_STAT_AREA] holds each blob's area.
stats = cv2.connectedComponentsWithStats(mask, 8)[2]
label_area = stats[1:, cv2.CC_STAT_AREA]  # skip label 0, the background

min_area, max_area = 50, 350  # area range for a single circle
singular_mask = (min_area < label_area) & (label_area <= max_area)
circle_area = np.mean(label_area[singular_mask])

# Each blob counts as round(area / single-circle area) circles.
n_circles = int(np.sum(np.round(label_area / circle_area)))
print('Total circles:', n_circles)
This code is simple and effective for rough counts.
However, there are definitely some assumptions here about the groups of circles compared to a normal circle size, and there are issues where circles at the image boundaries will not be counted correctly (these aren't well defined: a two-circle blob that is half cut off will look more like one circle, and there is no clear way to decide whether to count it with this method). Further, I just used automatic thresholding via Otsu here; you could get (probably better) results with more careful color filtering. Additionally, in the mask generated by Otsu, some circles have a few pixels missing from their center. Morphology could add these pixels back in, which would give you a (slightly larger) more accurate area for the single-circle components. Either way, I just wanted to give the general idea of how you could easily estimate this with minimal code.
New approach
Before, the goal was to count circles. This new approach instead counts the centers of the circles. The general idea is you threshold and then flood fill from a background pixel to fill in the background (flood fill works like the paint bucket tool in photo editing apps), that way you only see the centers, as shown in this answer below.
However, this relies on global thresholding, which isn't robust to local lighting changes. This means that since some centers are brighter/darker than others, you won't always get good results with a single threshold.
Here I've created an animation to show looping through different threshold values; watch as some centers appear and disappear at different times, meaning you get different counts depending on the threshold you choose (this is just a small patch of the image, it happens everywhere):
Notice that the first blob to appear in the top left actually disappears as the threshold increases. However, if we actually OR each frame together, then each detected pixel persists:
But now every single speck appears, so we should clean up the mask each frame so that we remove single pixels as they come (otherwise they may build up and be hard to remove later). Simple morphological opening with a small kernel will remove them:
Applied over the whole image, this method works incredibly well and finds almost every single cell. There are only three false positives (detected blob that's not a center) and two misses I can spot, and the code is very simple. The final thing to do after the mask has been created is simply count the components, minus one for the background. The only user input required here is a single point to flood fill from that is in the background (seed_pt in the code).
import cv2
import numpy as np

img = cv2.imread('circles.png', 0)
seed_pt = (25, 25)          # any point known to lie in the background
fill_color = 0
mask = np.zeros_like(img)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))

for th in range(60, 120):
    prev_mask = mask.copy()
    mask = cv2.threshold(img, th, 255, cv2.THRESH_BINARY)[1]
    mask = cv2.floodFill(mask, None, seed_pt, fill_color)[1]  # fill the background
    mask = cv2.bitwise_or(mask, prev_mask)                    # accumulate detections
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)     # remove single-pixel specks

n_centers = cv2.connectedComponents(mask)[0] - 1
print('There are %d cells in the image.' % n_centers)
There are 874 cells in the image.
One possible solution would be to read the image using OpenCV, convert it to grayscale, then use Canny edge detection and perform contour finding in OpenCV. This will return a list of contours. It would look something like:
import cv2
image = cv2.imread('path-to-your-image')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# tweak the parameters of the GaussianBlur for best performance
blurred = cv2.GaussianBlur(gray, (7, 7), 0)
# again, try different values here
edged = cv2.Canny(blurred, 20, 140)
# note: OpenCV 3.x returns 3 values here; in OpenCV 4.x findContours returns only (contours, hierarchy)
(_, contours, _) = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
print(len(contours))
If all of your images are like this one, consider thresholding it, not necessarily with an automatic threshold-seeking algorithm like Otsu, but simply with a fixed threshold value. Yes, before thresholding you have to convert your color input to grayscale, or take one of the color channels. Then, based on a few experiments with channels and threshold values, pick a threshold that leaves the circles as rings with holes in the monochrome result. Based on your PNG image, I found that a value of 81 (gray intensity ranges from 0 to 255) works well to threshold the grayscale version of your input into such a binary image with the holes in place, as described above.
Then simply count those holes.
Holes can be found by seed-filling the white area connected to the image border. As a result you will have white hole components on a black background, so simply count them.
You can find more details here: http://www.leptonica.com/filling.html, and you can use Leptonica primitives to do the thresholding, hole counting and so on.
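A rough sketch of the same idea with OpenCV in Python instead of Leptonica (the threshold value 81 is the one mentioned above; depending on the polarity of your image you may need THRESH_BINARY_INV instead of THRESH_BINARY):
import cv2
import numpy as np
# Sketch of the hole-counting idea; 'circles.png' and the value 81 follow the text above.
gray = cv2.imread('circles.png', cv2.IMREAD_GRAYSCALE)
binary = cv2.threshold(gray, 81, 255, cv2.THRESH_BINARY)[1]
# Flood-fill the white area connected to the image border with black, so only
# the white "holes" (the circle interiors) remain as isolated components.
h, w = binary.shape
for seed in [(0, 0), (w - 1, 0), (0, h - 1), (w - 1, h - 1)]:
    if binary[seed[1], seed[0]] == 255:
        cv2.floodFill(binary, None, seed, 0)
# Count the remaining white components, minus one for the background label.
n_holes = cv2.connectedComponents(binary)[0] - 1
print('holes found:', n_holes)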
I have the following image.
I would like to remove the orange boxes/rectangle around numbers and keep the original image clean without any orange grid/rectangle.
Below is my current code, but it does not remove them.
Mat src = Imgcodecs.imread("enveloppe.jpg", Imgcodecs.CV_LOAD_IMAGE_COLOR);

//convert to HSV so the orange hue can be isolated
Mat hsvMat = new Mat();
Imgproc.cvtColor(src, hsvMat, Imgproc.COLOR_BGR2HSV);

//HSV range covering the orange boxes
Scalar lowerThreshold = new Scalar(0, 50, 50);
Scalar upperThreshold = new Scalar(25, 255, 255);

Mat mask = new Mat();
Core.inRange(hsvMat, lowerThreshold, upperThreshold, mask);
//src.setTo(new Scalar(255, 255, 255), mask);
What should I do next?
How can I remove the orange boxes/rectangles from the original images?
Update:
For information, the mask contains exactly the boxes/rectangles that I want to remove. I don't know how to use this mask to remove the boxes/rectangles from the source (src) image so that it looks as if they were never there.
This is what I did to solve the problem. I solved the problem in C++ and I used OpenCV.
Part 1: Find box candidates
Firstly I wanted to isolate the signal that is specific to the red channel. I split the image into its three channels. I then subtracted the blue channel from the red channel, and the green channel from the red channel. After that I subtracted the two subtraction results from one another. The final subtraction result is shown in the image below.
using namespace cv;
using namespace std;
Mat src_rgb = imread("image.jpg");
std::vector<Mat> channels;
split(src_rgb, channels);
Mat diff_rb, diff_rg;
subtract(channels[2], channels[0], diff_rb);
subtract(channels[2], channels[1], diff_rg);
Mat diff;
subtract(diff_rb, diff_rg, diff);
My next goal was to divide the parts of obtained image into separate "groups". To do that, I smoothed the image a little bit with a Gaussian filter. Then I applied a threshold to obtain a binary image; finally I looked for external contours within that image.
GaussianBlur(diff, diff, cv::Size(11, 11), 2.0, 2.0);
threshold(diff, diff, 5, 255, THRESH_BINARY);
vector<vector<Point>> contours;
findContours(diff, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_NONE);
Click to see subtraction result, Gaussian blurred image, thresholded image and detected contours.
Part 2: Inspect box candidates
After that, I had to estimate whether the interior of each contour contained a number or something else. I made the assumption that numbers will always be printed with black ink and that they will have sharp edges. Therefore I took the red channel image (channels[2]), applied just a little bit of Gaussian smoothing, and convolved it with a Laplacian operator.
Mat blurred_ch2;
GaussianBlur(channels[2], blurred_ch2, cv::Size(7, 7), 1, 1);
Mat laplace_result;
Laplacian(blurred_ch2, laplace_result, -1, 1);
I then took the resulting image and applied the following procedure for every contour separately. I computed a standard deviation of the pixel values within the contour interior. Standard deviation was high inside the contours that surrounded numbers; and it was low inside the two contours that surrounded the dog's head and the letters on top of the stamp.
That is why I could apply a standard deviation threshold. The standard deviation was approximately twice as large for contours containing numbers, so this was an easy way to select only the contours that contained numbers. Then I drew the contour interior mask. I used erosion and subtraction to obtain the "box edge mask".
The final step was fairly easy. I computed an estimate of average pixel value nearby the box on every channel of the image. Then I changed all pixel values under the "box edge mask" to those values on every channel. After I repeated that procedure for every box contour, I merged all three channels into one.
Mat mask(src_rgb.size(), CV_8UC1);
for (int i = 0; i < contours.size(); ++i)
{
    // Filled mask of the current contour's interior.
    mask.setTo(0);
    drawContours(mask, contours, i, cv::Scalar(200), -1);

    // Skip contours whose interior has low Laplacian variance (no sharp digits inside).
    Scalar mean, stdev;
    meanStdDev(laplace_result, mean, stdev, mask);
    if (stdev.val[0] < 10.0) continue;

    // Keep only a thin band along the box edge (the "box edge mask").
    Mat eroded;
    erode(mask, eroded, cv::Mat(), cv::Point(-1, -1), 6);
    subtract(mask, eroded, mask);

    for (int c = 0; c < src_rgb.channels(); ++c)
    {
        // Estimate the average pixel value around the band's border and
        // paint the whole band with it on this channel.
        erode(mask, eroded, cv::Mat());
        subtract(mask, eroded, eroded);
        Scalar mean, stdev;
        meanStdDev(channels[c], mean, stdev, eroded);
        channels[c].setTo(mean, mask);
    }
}
Mat final_result;
merge(channels, final_result);
imshow("Final Result", final_result);
Click to see red channel of the image, the result of convolution with Laplacian operator, drawn mask of the box edges and the final result.
Please note
This code is far from optimal; in particular, the last loop does quite a lot of unnecessary work. But I think that in this case readability is more important (and the author of the question did not request an optimized solution anyway).
Looking towards more general solution
After I posted the initial reply, the author of the question noted that the digits can be of any color and their edges are not necessarily sharp. That means that above procedure can fail because of various reasons. I altered the input image so that it contains different kinds of numbers (click to see the image) and you can run my algorithm on this input and analyze what goes wrong.
The way I see it, one of these approaches is needed (or perhaps a mixture of both) to obtain a more "general" solution:
concentrate only on rectangle shape and color (confirm that the box candidate is really an orange box and remove it regardless of what is inside)
concentrate on numbers only (run a proper number detection algorithm inside the interior of every box candidate; if it contains a single number, remove the box)
I will give a trivial example of the first approach. If you can assume that orange box size will always be the same, just check the box size instead of standard deviation of the signal in the last loop of the algorithm:
Rect rect = boundingRect(contours[i]);
float area = rect.area();
if (area < 1000 || area > 1200) continue;
Warning: the actual area of the rectangles is around 600 px², but I took into account the Gaussian blurring, which causes the contour to expand. Please also note that if you use this approach you no longer need to perform the blurring or Laplace operations on the red channel image.
You can also add other simple constraints to that condition; the ratio between width and height is the first one that comes to mind. Geometric properties can also be a good option (right angles, straight edges, convexity ...).
I am very new to OpenCV (2.1), so please keep that in mind.
I managed to calibrate the cheap web camera that I am using (with a wide-angle attachment), using the checkerboard calibration method to produce the intrinsic and distortion coefficients.
I then have no trouble feeding these values back in and producing image maps, which I then apply to a video feed to correct the incoming images.
However, I run into an issue. I know that when it warps/corrects the image, it creates several skewed sections and then crops out any black areas. My question is: can I view the complete warped image, including the regions that have black areas? Below is an example of the black regions with skewed sections, in case my terminology was unclear:
An image better conveying the regions I am talking about can be found here! This image was discovered in this post.
Currently: The cvRemap() returns basically the yellow box in the image linked above, but I want to see the whole image as there is relevant data I am looking to get out of it.
What I've tried: Applying a scale conversion to the image map to fit the complete image (including stretched parts) into frame
CvMat *intrinsic = (CvMat*)cvLoad( "Intrinsics.xml" );
CvMat *distortion = (CvMat*)cvLoad( "Distortion.xml" );
cvInitUndistortMap( intrinsic, distortion, mapx, mapy );
cvConvertScale(mapx, mapx, 1.25, -shift_x); // Some sort of scale conversion
cvConvertScale(mapy, mapy, 1.25, -shift_y); // applied to the image map
cvRemap(distorted,undistorted,mapx,mapy);
cvConvertScale, even when I think I have aligned the x/y shift correctly (by guess-and-check), somehow distorts the image map, making the correction useless. There might be some math involved here that I am not following/understanding correctly.
Does anyone have any other suggestions for solving this problem, or an idea of what I might be doing wrong? I've also tried writing my own code to fix the distortion, but let's just say OpenCV already knows how to do it well.
From memory, you need to use InitUndistortRectifyMap(cameraMatrix,distCoeffs,R,newCameraMatrix,map1,map2), of which InitUndistortMap is a simplified version.
cvInitUndistortMap( intrinsic, distort, map1, map2 )
is equivalent to:
cvInitUndistortRectifyMap( intrinsic, distort, Identity matrix, intrinsic,
map1, map2 )
The new parameters are R and newCameraMatrix. R specifies an additional transformation (e.g. rotation) to perform before remapping (just set it to the identity matrix).
The parameter of interest to you is newCameraMatrix. In InitUndistortMap this is the same as the original camera matrix, but you can use it to get that scaling effect you're talking about.
You get the new camera matrix with GetOptimalNewCameraMatrix(cameraMat, distCoeffs, imageSize, alpha,...). You basically feed in intrinsic, distort, your original image size, and a parameter alpha (along with containers to hold the result matrix, see documentation). The parameter alpha will achieve what you want.
I quote from the documentation:
The function computes the optimal new camera matrix based on the free
scaling parameter. By varying this parameter the user may retrieve
only sensible pixels alpha=0, keep all the original image pixels if
there is valuable information in the corners alpha=1, or get something
in between. When alpha>0, the undistortion result will likely have
some black pixels corresponding to “virtual” pixels outside of the
captured distorted image. The original camera matrix, distortion
coefficients, the computed new camera matrix and the newImageSize
should be passed to InitUndistortRectifyMap to produce the maps for
Remap.
So for the extreme example with all the black bits showing you want alpha=1.
In summary:
call cvGetOptimalNewCameraMatrix with alpha=1 to obtain newCameraMatrix.
use cvInitUndistortRectifymap with R being identity matrix and newCameraMatrix set to the one you just calculated
feed the new maps into cvRemap.
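For reference, here is a sketch of those three steps with the modern Python API (the question uses the legacy C API; the XML file names follow the question, 'frame.jpg' stands in for one distorted frame, and reading the matrices back this way assumes they were saved in OpenCV's own XML format):
import cv2
import numpy as np
# Load the calibration saved earlier (assumed to be in OpenCV XML format).
fs = cv2.FileStorage('Intrinsics.xml', cv2.FILE_STORAGE_READ)
cameraMatrix = fs.getFirstTopLevelNode().mat()
fs.release()
fs = cv2.FileStorage('Distortion.xml', cv2.FILE_STORAGE_READ)
distCoeffs = fs.getFirstTopLevelNode().mat()
fs.release()
distorted = cv2.imread('frame.jpg')
h, w = distorted.shape[:2]
# alpha=1 keeps every source pixel, so the stretched regions and black borders stay visible.
newCameraMatrix, roi = cv2.getOptimalNewCameraMatrix(cameraMatrix, distCoeffs, (w, h), 1, (w, h))
# R is the identity matrix: we only undistort, there is no stereo rectification.
map1, map2 = cv2.initUndistortRectifyMap(cameraMatrix, distCoeffs, np.eye(3), newCameraMatrix, (w, h), cv2.CV_16SC2)
undistorted = cv2.remap(distorted, map1, map2, cv2.INTER_LINEAR)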
I'm trying to use OpenCV to "parse" screenshots from the iPhone game Blocked. The screenshots are cropped to look like this:
I suppose for right now I'm just trying to find the coordinates of each of the 4 points that make up each rectangle. I did see the sample file squares.c that comes with OpenCV, but when I run that algorithm on this picture, it comes up with 72 rectangles, including the rectangular areas of whitespace that I obviously don't want to count as one of my rectangles. What is a better way to approach this? I tried doing some Google research, but for all of the search results, there is very little relevant usable information.
A similar issue has already been discussed:
How to recognize rectangles in this image?
As for your data, the rectangles you are trying to find are the only black objects. So you can try a threshold binarization: black pixels are those whose three RGB values are ALL less than 40 (I found this empirically). This simple operation makes your picture look like this:
After that you could apply a Hough transform to find lines (discussed in the topic I referred to), or you can do it more simply. Compute the integral projections of the black pixels onto the X and Y axes. (The projection onto X is a vector whose entry for x_i is the number of black pixels whose first coordinate equals x_i.) The possible x and y values of the grid are then the peaks of these projections. Then look through all the possible segments bounded by the found x and y values (if there are many black pixels between (x_i, y_j) and (x_i, y_k), there is probably a line there). Finally, compose the line segments into rectangles!
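A small sketch of the projection idea (the file name and the all-RGB-below-40 rule follow the text above; treating everything above half the maximum as a peak is just a guess you would tune):
import cv2
import numpy as np
img = cv2.imread('screenshot.png')                   # placeholder name for the cropped screenshot
black = np.all(img < 40, axis=2).astype(np.uint8)    # 1 where all three channels are below 40
proj_x = black.sum(axis=0)   # number of black pixels in each column
proj_y = black.sum(axis=1)   # number of black pixels in each row
# Candidate x (and analogously y) grid positions are the peaks of the projections.
xs = np.where(proj_x > 0.5 * proj_x.max())[0]
ys = np.where(proj_y > 0.5 * proj_y.max())[0]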
Here's a complete Python solution. The main idea is:
Apply pyramid mean shift filtering to help threshold accuracy
Otsu's threshold to get a binary image
Find contours and filter using contour approximation
Here's a visualization of each detected rectangle contour
Results
import cv2

image = cv2.imread('1.png')

# Mean shift filtering smooths colour regions and helps the threshold step
blur = cv2.pyrMeanShiftFiltering(image, 11, 21)
gray = cv2.cvtColor(blur, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Find contours (the indexing handles both OpenCV 3.x and 4.x return values)
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

for c in cnts:
    # Keep only contours whose approximation has 4 corners, i.e. rectangles
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.015 * peri, True)
    if len(approx) == 4:
        x, y, w, h = cv2.boundingRect(approx)
        cv2.rectangle(image, (x, y), (x + w, y + h), (36, 255, 12), 2)

cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.waitKey()
I wound up just building on my original method and doing as Robert suggested in his comment on my question. After I get my list of rectangles, I then run through and calculate the average color over each rectangle. I check to see if the red, green, and blue components of the average color are each within 10% of the gray and blue rectangle colors, and if they are I save the rectangle, if they aren't I discard it. This process gives me something like this:
From this, it's trivial to get the information I need (orientation, starting point, and length of each rectangle, considering the game window as a 6x6 grid).
The blocks look like bitmaps - why don't you use simple template matching with different templates for each block size/color/orientation?
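A rough sketch of that idea, assuming one template image per block size/color/orientation ('board.png' and 'block_h2_blue.png' are made-up file names):
import cv2
import numpy as np
board = cv2.imread('board.png')
template = cv2.imread('block_h2_blue.png')
th, tw = template.shape[:2]
result = cv2.matchTemplate(board, template, cv2.TM_CCOEFF_NORMED)
# Every location scoring above the threshold counts as a match; neighbouring
# hits on the same block would still need non-maximum suppression.
for y, x in zip(*np.where(result > 0.9)):
    cv2.rectangle(board, (int(x), int(y)), (int(x) + tw, int(y) + th), (0, 255, 0), 2)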
Since your problem is the small rectangles, I would start by removing them.
Those thin lines are much narrower than the borders of the real rectangles, so they can be removed with morphological operations on the image.
Using a structuring element that looks like this:
element = [ 1 1
            1 1 ]
should remove lines that are less than two pixels wide. After the small lines are removed, the rectangle-finding algorithm of OpenCV will most likely do the rest of the job for you.
The erosion can be done in OpenCV with the function cvErode.
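A small sketch of that step with the modern Python API (the answer refers to the legacy cvErode; the all-channels-below-40 binarization reuses the rule from the earlier answer, and 'screenshot.png' is a placeholder):
import cv2
import numpy as np
img = cv2.imread('screenshot.png')
# Make the dark objects (rectangles and thin grid lines) the white foreground.
mask = np.all(img < 40, axis=2).astype(np.uint8) * 255
element = np.ones((2, 2), np.uint8)   # the 2x2 structuring element shown above
# Erosion removes foreground structures thinner than the kernel, i.e. the 1-px
# grid lines; a full opening (erode then dilate) also restores the thick
# rectangle borders to their original width.
eroded = cv2.erode(mask, element)
opened = cv2.morphologyEx(mask, cv2.MORPH_OPEN, element)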
Try one of the many corner detectors, like the Harris corner detector. It is also generally a good idea to try that at multiple resolutions, so do some preprocessing at varying magnifications.
It appears that you want squares dominated by a particular color. You could suppress the other colors by first splitting the channels with something like cvSplit, then thresholding on the color so that only that region remains, and following that with a cropping operation. I think that could work as well.
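A rough sketch of that suggestion in Python (the channel chosen and the threshold values are pure guesses and would need tuning for the actual screenshot):
import cv2
import numpy as np
img = cv2.imread('screenshot.png')           # placeholder file name
b, g, r = cv2.split(img)                     # the cvSplit step mentioned above
# Keep only pixels where blue clearly dominates, suppressing everything else.
mask = ((b > 150) & (r < 100) & (g < 100)).astype(np.uint8) * 255
# Crop to the bounding box of whatever survives the colour filter.
ys, xs = np.where(mask > 0)
if len(xs) > 0:
    crop = img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]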