I am trying to create an app that can read text from an image, but I'm having problems clearing the background. I want results like:
Input Image 1:
Output Image 1:
This is the code I have tried:
cvtColor(org, tmp, CV_BGR2GRAY);
normalize(tmp, tmp, 0, 255, NORM_MINMAX);
threshold(tmp, dst, 0, 255, CV_THRESH_OTSU);
The lines that interest you are oriented at either 0 or 90 degrees, with a small variance in either direction. Lines in the background patterns are slanted. You can identify the lines with the canny algorithm, then check orientation. You'll be left with some gaps where the vertical and horizontal lines meet, depending on the font. Then return to the original image and use a watershed based on color, or use connected components, or whatever to avoid losing those connecting regions.
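Here's a minimal sketch of that orientation check in Python (the file name, Canny thresholds, and the 5-degree tolerance are placeholders to tune for your images):

import cv2
import numpy as np

img = cv2.imread('input.jpg')  # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)

# Detect line segments, then keep only the near-horizontal and
# near-vertical ones; slanted background lines are discarded.
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 50, minLineLength=20, maxLineGap=5)
mask = np.zeros_like(gray)
tolerance = 5  # assumed allowed deviation from 0/90 degrees
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1))) % 90
        if angle < tolerance or angle > 90 - tolerance:
            cv2.line(mask, (x1, y1), (x2, y2), 255, 2)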
Related
I need to find the edges of a document that is in the user's hands.
1) Original image from camera:
2) Then I convert the image to grayscale:
3) Then I apply a blur:
4) Then I find edges in the image using Canny:
5) And finally I use dilate:
As you can see in the last image, the contour around the map is torn and not fully closed. What is my error, and how can I solve the problem so that the outline of the document is determined completely?
This is the code I use to do it:
final Mat mat = new Mat();
sourceMat.copyTo(mat);
//convert the image to grayscale
Imgproc.cvtColor(mat, mat, Imgproc.COLOR_BGR2GRAY);
//blur to enhance edge detection
Imgproc.GaussianBlur(mat, mat, new Size(5, 5), 0);
if (isClicked) saveImageFromMat(mat, "blur", "blur");
//detect edges (Canny produces an 8-bit binary edge image)
int thresh = 128;
Imgproc.Canny(mat, mat, thresh, thresh * 2);
//dilate helps to connect nearby line segments
Imgproc.dilate(mat, mat,
        Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(3, 3)),
        new Point(-1, -1),
        2,
        1,
        new Scalar(1));
This answer is based on my above comment. If someone is holding the document, you cannot see the edge that is behind the user's hand. So, any method for detecting the outline of the document must be robust to some missing parts of the edge.
I suggest using a variant of the Hough transform to detect the document. The Wikipedia article about the Hough transform makes it sound quite scary (as Wikipedia often does with mathematical subjects), but don't be discouraged; it is actually not too difficult to understand or implement.
The original Hough transform detected straight lines in images. As explained in this OpenCV tutorial, any straight line in an image can be defined by 2 parameters: an angle θ and a distance r of the line from the origin. So you quantize these 2 parameters, and create a 2D array with one cell for every possible line that could be present in your image. (The finer the quantization you use, the larger the array you will need, but the more accurate the position of the found lines will be.) Initialize the array to zeros. Then, for every pixel that is part of an edge detected by Canny, you determine every line (θ,r) that the pixel could be part of, and increment the corresponding bin. After processing all pixels, you will have, for each bin, a count of how many pixels were detected on the line corresponding to that bin. Counts which are high enough probably represent real lines in the image, even if parts of the line are missing. So you just scan through the bins to find bins which exceed the threshold.
OpenCV contains Hough detectors for straight lines and circles, but not for rectangles. You could either use the line detector and check for 4 lines that form the edges of your document, or you could write your own Hough detector for rectangles, perhaps using the paper Jung 2004 for inspiration. Rectangles have at least 5 degrees of freedom (2D position, scale, aspect ratio, and rotation angle), and the memory requirement for a 5D array obviously goes up pretty fast. But since the range of each parameter is limited (i.e., the document's aspect ratio is known, and you can assume the document will be well centered and not rotated much), it is probably feasible.
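For the straight-line variant, a rough sketch with OpenCV's standard Hough detector in Python (the thresholds are assumptions; you would still need to group the found lines into the 4 document edges):

import cv2
import numpy as np

img = cv2.imread('document.jpg')  # placeholder file name
edges = cv2.Canny(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 50, 150)

# Each returned line is (r, theta). The accumulator threshold of 150
# means at least 150 edge pixels must vote for a line, so a line can
# be found even if part of it is hidden behind a hand.
lines = cv2.HoughLines(edges, 1, np.pi / 180, 150)
if lines is not None:
    for r, theta in lines[:, 0]:
        a, b = np.cos(theta), np.sin(theta)
        x0, y0 = a * r, b * r
        pt1 = (int(x0 + 2000 * (-b)), int(y0 + 2000 * a))
        pt2 = (int(x0 - 2000 * (-b)), int(y0 - 2000 * a))
        cv2.line(img, pt1, pt2, (0, 255, 0), 2)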
Let's say I have the following image, where there is a folder with a white label on it.
What I want is to detect the coordinates of end points of the folder and the white paper on it (both rectangles).
Using the coordinates, I want to know the exact place of the paper on the folder.
GIVEN: The inner white paper rectangle is always going to be of a fixed size, so maybe we can use this knowledge somewhere?
I am new to OpenCV and am trying to find some guidance on how I should approach this problem.
Problem statement: We cannot rely on a color-based solution, since this is just an example and the colors of both the folder and the rectangular paper can change.
There can be other noisy papers too, but one thing is given: the folder and the big rectangular paper will always be the two biggest rectangles at any given time.
I have tried OpenCV Canny for edge detection, and it looks like this image.
Now how can I find the coordinates of the outer rectangle and the inner rectangle?
For this image, there are three dominant colors: (1) the background (yellow), (2) the folder (blue), (3) the paper (white). Using the color info may help; I analyzed it in RGB and HSV like this:
As you can see (second row, third cell), the regions can easily be separated in H (HSV) if you find the folder mask first.
We can choose thresholds in the H channel accordingly. My steps:
(1) find the folder region mask in HSV using inRange(hsv, (80, 10, 20), (150, 255, 255))
(2) find contours on the mask and filter them by width and height
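A minimal Python sketch of these two steps (the 100-px size filter is an assumption to tune):

import cv2

img = cv2.imread('folder.jpg')  # placeholder file name
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Step (1): mask the blue folder region by hue
mask = cv2.inRange(hsv, (80, 10, 20), (150, 255, 255))

# Step (2): find contours on the mask and keep only large ones;
# [-2] picks the contour list in both OpenCV 3.x and 4.x
contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    if w > 100 and h > 100:
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)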
Here is the result:
Related:
Choosing the correct upper and lower HSV boundaries for color detection with `cv::inRange` (OpenCV)
How to define a threshold value to detect only green colour objects in an image: OpenCV
You can opt for [Adaptive Threshold](https://docs.opencv.org/3.4/d7/d4d/tutorial_py_thresholding.html).
Obtain the hue channel of the image.
Perform adaptive threshold with a certain block size. I used a block size of 15 on the image at half its original size.
This is invariant to color, as you expected. Now you can go ahead and extract what you need!
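A short sketch of those steps (the block size of 15 follows the description above; the constant C = 2 is an assumption):

import cv2

img = cv2.imread('folder.jpg')  # placeholder file name
img = cv2.resize(img, None, fx=0.5, fy=0.5)  # half size, as described above

# Hue is channel 0 of the HSV image
hue, _, _ = cv2.split(cv2.cvtColor(img, cv2.COLOR_BGR2HSV))

# Adaptive threshold with block size 15 (the block size must be odd)
thresh = cv2.adaptiveThreshold(hue, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                               cv2.THRESH_BINARY, 15, 2)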
This solution helps to identify the white paper region of the image.
This is the full code for the solution:
import cv2
import numpy as np

image = cv2.imread('stack2.jpg', -1)
paper = cv2.resize(image, (500, 500))
ret, thresh_gray = cv2.threshold(cv2.cvtColor(paper, cv2.COLOR_BGR2GRAY),
                                 200, 255, cv2.THRESH_BINARY)
image, contours, hier = cv2.findContours(thresh_gray, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

for c in contours:
    area = cv2.contourArea(c)
    rect = cv2.minAreaRect(c)
    box = cv2.boxPoints(rect)
    # convert all coordinate floating point values to int
    box = np.int0(box)
    # draw a green rotated rectangle
    if area > 500:
        cv2.drawContours(paper, [box], 0, (0, 255, 0), 1)
        print([box])

cv2.imshow('paper', paper)
cv2.imwrite('paper.jpg', paper)
cv2.waitKey(0)
First, using a manual threshold (200), you can detect the paper in the image.
ret, thresh_gray = cv2.threshold(cv2.cvtColor(paper, cv2.COLOR_BGR2GRAY), 200, 255, cv2.THRESH_BINARY)
After that, you should find the contours and get the minAreaRect(). Then you should get the coordinates of that rectangle (box) and draw it.
rect = cv2.minAreaRect(c)
box = cv2.boxPoints(rect)
box = np.int0(box)
cv2.drawContours(paper, [box], 0, (0, 255, 0),1)
In order to avoid small white regions of the image, you can compute area = cv2.contourArea(c) and only call drawContours() if area > 500.
final output:
Console output gives coordinates for the white paper.
console output:
[array([[438, 267],
        [199, 256],
        [209,  60],
        [447,  71]], dtype=int64)]
I am trying to detect the white shapes in a video and can successfully do it for one video.
// Create and display a new matrix for triangles
triangles = src.clone();
GaussianBlur(triangles, triangles, Size(5, 5), 0, 0);
inRange(triangles, Scalar(150,150,150), Scalar(255, 255, 255), triangles);
imshow("triangles", triangles);
This gives me the result:
http://s8.postimg.org/o9xg284jp/triangles.png
However, if I use a different video, then the scalar value of 150 may not be appropriate (for example, in a light environment everything gets detected):
http://s8.postimg.org/m09brgvlx/bad_triangles.png
For this video I would need to change the minimum scalar to around 190-200 for it to work properly. My question: is there a good way to determine the correct scalar value to use? I know it sounds simple to some, but I've got a headache because of it!
http://colorizer.org/
If you check here, you can see what your problem is: RGB = (255, 155, 155) is probably not "white", but your inRange call gives a true output for it.
Try to use the HSL color space. Lightness > 90 is white for sure, no matter what the H and S channel values are. Use the BGR2HLS conversion, then use inRange with the L channel between 90 and 100. (Note that OpenCV scales the 8-bit L channel to 0-255, so 90-100 on colorizer.org's 0-100 scale corresponds to roughly 230-255.)
Actually, for color detection problems, the most commonly used color spaces are HSV and HSL, not RGB!
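Here's what that might look like in Python (the 230 lower bound is 90% of 255, per the scaling note above; the file name is a placeholder):

import cv2

frame = cv2.imread('frame.jpg')  # placeholder: one video frame
hls = cv2.cvtColor(frame, cv2.COLOR_BGR2HLS)

# OpenCV stores 8-bit HLS as H in [0,180], L and S in [0,255],
# so lightness above 90% means roughly L >= 230
white = cv2.inRange(hls, (0, 230, 0), (180, 255, 255))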
There is probably no way to automatically determine a threshold that works for all kinds of videos. But to make it less dependent on the overall lighting of the video, you could make it depend on the mean or median pixel value of the image, as sketched below.
Or if you know how big your object appears in the image, you could choose the threshold accordingly.
Another approach could be to normalize the brightness of the video.
But which approach is best depends strongly on your exact situation and requirements.
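For the mean/median idea above, a minimal sketch could look like this (the offset of 60 is a made-up value you would tune per video):

import cv2
import numpy as np

gray = cv2.cvtColor(cv2.imread('frame.jpg'), cv2.COLOR_BGR2GRAY)  # placeholder frame

# Tie the threshold to the overall brightness instead of hard-coding it
thresh_value = np.median(gray) + 60
_, white = cv2.threshold(gray, thresh_value, 255, cv2.THRESH_BINARY)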
I have an image which is multi-colored.
I want to calculate the dominant color of the image. The dominant color is red, and I want to filter the red out. I am using the following code in OpenCV, but it's not working.
inRange(input_image, Scalar(0, 0, 0), Scalar(0, 0, 255), output);
How can I get the dominant color otherwise? My final project should determine the dominant color of the object on its own. What is the best method for this?
You should quantize (reduce the number of colors in) your image before searching for the most frequent color.
Why? Imagine an image that has 100 pixels of (0,0,255) (blue in RGB), 100 pixels of (0,0,254) (almost blue; you won't even see the difference) and 150 pixels of (0,255,0) (green). Which is the most frequent color here? Obviously, it's green. But after quantization you will get 200 pixels of blue and 150 pixels of green.
Read this discussion: How to reduce the number of colors in an image with OpenCV?. Here's a simple example:
int coef = 200;
Mat quantized = img/coef;
quantized = quantized*coef;
And this is what I've got after applying it:
You can also use k-means or mean-shift to do the quantization (these are more effective approaches).
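A quick Python sketch that combines the quantization idea with a frequency count (coef = 64 here is my own choice; it keeps 4 levels per channel):

import cv2
import numpy as np

img = cv2.imread('image.jpg')  # placeholder file name
coef = 64

# Integer division merges near-identical colors into one bin
quantized = (img // coef) * coef

# Count each remaining (B, G, R) triple and take the most frequent one
pixels = quantized.reshape(-1, 3)
colors, counts = np.unique(pixels, axis=0, return_counts=True)
print('dominant BGR color:', colors[np.argmax(counts)])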
The best method is by analyzing histograms.
Your problem is a classical "find the peak and the area under the peak" problem. Given an image (let's say we take only the third channel for simplicity):
You will have to find the highest peak in that histogram. The easiest method is to simply query the X for which Y is maximized. More advanced methods work with windows - they average the Y-values of 10 consecutive data points, etc.
Also, work in the HSV or YCrCb color space. HSV is good because the "Hue" channel translates very closely to what you mean by "Color". RGB is really not well suited for image analysis.
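A minimal sketch of such a peak query in Python (the 10-bin averaging window mirrors the windowed method mentioned above; the file name is a placeholder):

import cv2
import numpy as np

img = cv2.imread('image.jpg')  # placeholder file name
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Histogram of the hue channel; hue range is [0, 180) in 8-bit OpenCV
hist = cv2.calcHist([hsv], [0], None, [180], [0, 180]).ravel()

# Simplest method: the hue bin with the highest count
print('dominant hue bin:', int(np.argmax(hist)))

# Windowed variant: average 10 consecutive bins before taking the max
smoothed = np.convolve(hist, np.ones(10) / 10, mode='same')
print('smoothed peak:', int(np.argmax(smoothed)))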
I'm writing an Android app in OpenCV to detect blobs. One task is to threshold the image to differentiate the foreground objects from the background (see image).
It works fine as long as the image is known and I can manually pass a threshold value to threshold()--in this particular image say, 200. But assuming that the image is not known with the only knowledge that there would be a dark solid background and lighter foreground objects how can I dynamically figure out the threshold value?
I've come across the histogram, with which I can compute the intensity distribution of the grayscale image. But I couldn't find a method to analyze the histogram and choose the value where the objects of interest (lighter) lie. That is, I want to separate the obviously dark background spikes from the lighter foreground spikes; in this case above 200, but in another case it could be, say, 100 if the objects are grayish.
If all your images are like this, or can be brought to this style, I think cv2.THRESH_OTSU, i.e. Otsu's thresholding algorithm, is a good shot.
Below is a sample using Python in a command terminal:
>>> import cv2
>>> import numpy as np
>>> img2 = cv2.imread(r'D:\Abid_Rahman_K\work_space\sofeggs.jpg',0)
>>> ret,thresh = cv2.threshold(img2,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
>>> ret
122.0
ret is the threshold value, which is calculated automatically. We just pass 0 as the threshold value for this.
I got 124 in GIMP (which is comparable to the result we got). It also removes the noise. See the result below:
If you say that the background is dark (black) and the foreground is lighter, then I recommend using the YUV color space (or any other YXX space like YCrCb, etc.), because the first component of such color spaces is luminance (or lightness).
So after the Y channel is extracted (via the extractChannel function), we need to analyse the histogram of this channel (image):
See the first (left) hump? It represents the dark areas (the background, in your situation) of your image. So our aim now is to find the segment (on the abscissa; the red part in the image) that contains this hump. Obviously the left point of this segment is zero. The right point is the first point where:
- a (local) maximum of the histogram lies to the left of the point, and
- the value of the histogram is less than some small epsilon (you can set it to 10).
I drew a green vertical line to show the location of the right point of the segment in this histogram.
And that's it! This right point of the segment is the needed threshold. Here's the result (epsilon is 10 and the calculated threshold is 50):
I think that it's not a problem for you to delete the noise in the image above.
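For reference, here is a Python sketch of the segment search described above (I assume the background hump lies in the lower half of the histogram; the file name is a placeholder):

import cv2
import numpy as np

img = cv2.imread('image.jpg')  # placeholder file name
y = cv2.extractChannel(cv2.cvtColor(img, cv2.COLOR_BGR2YUV), 0)
hist = cv2.calcHist([y], [0], None, [256], [0, 256]).ravel()

# Walk right from the hump's peak until the histogram value
# drops below epsilon; that bin is the threshold
epsilon = 10
threshold = int(np.argmax(hist[:128]))  # assumption: background hump is below 128
while threshold < 255 and hist[threshold] >= epsilon:
    threshold += 1

_, binary = cv2.threshold(y, threshold, 255, cv2.THRESH_BINARY)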
The following is a C++ implementation of Abid's answer that works with OpenCV 3.x:
// Convert the source image to a 1-channel grayscale image:
Mat gray;
cvtColor(src, gray, COLOR_BGR2GRAY);
// Apply the threshold function with the THRESH_OTSU flag as well.
// You can skip having it return the value, but I include it to show
// the result from Otsu's method.
double thresholdValue = threshold(gray, gray, 0, 255, THRESH_BINARY + THRESH_OTSU);
// Present the threshold value
printf("Threshold value: %f\n", thresholdValue);
Running this against the original image, I get the following:
OpenCV calculated a threshold value of 122 for it, close to the value Abid found in his answer.
Just to verify, I altered the original image as seen here:
And produced the following, with a new threshold value of 178: