Extracting text from an image with tesseract [duplicate] - opencv

I asked a similar question here but that is focused more on tesseract.
I have a sample image as below. I would like to make the white square my Region of Interest and then crop out that part (square) and create a new image with it. I will be working with different images so the square won't always be at the same location in all images. So I will need to somehow detect the edges of the square.
What are some pre-processing methods I can perform to achieve the result?

Using your test image I was able to remove all the noises with a simple erosion operation.
After that, a simple iteration on the Mat to find for the corner pixels is trivial, and I talked about that on this answer. For testing purposes we can draw green lines between those points to display the area we are interested at in the original image:
At the end, I set the ROI in the original image and crop out that part.
The final result is displayed on the image below:
I wrote a sample code that performs this task using the C++ interface of OpenCV. I'm confident in your skills to translate this code to Python. If you can't do it, forget the code and stick with the roadmap I shared on this answer.
#include <cv.h>
#include <highgui.h>
int main(int argc, char* argv[])
{
cv::Mat img = cv::imread(argv[1]);
std::cout << "Original image size: " << img.size() << std::endl;
// Convert RGB Mat to GRAY
cv::Mat gray;
cv::cvtColor(img, gray, CV_BGR2GRAY);
std::cout << "Gray image size: " << gray.size() << std::endl;
// Erode image to remove unwanted noises
int erosion_size = 5;
cv::Mat element = cv::getStructuringElement(cv::MORPH_CROSS,
cv::Size(2 * erosion_size + 1, 2 * erosion_size + 1),
cv::Point(erosion_size, erosion_size) );
cv::erode(gray, gray, element);
// Scan the image searching for points and store them in a vector
std::vector<cv::Point> points;
cv::Mat_<uchar>::iterator it = gray.begin<uchar>();
cv::Mat_<uchar>::iterator end = gray.end<uchar>();
for (; it != end; it++)
{
if (*it)
points.push_back(it.pos());
}
// From the points, figure out the size of the ROI
int left, right, top, bottom;
for (int i = 0; i < points.size(); i++)
{
if (i == 0) // initialize corner values
{
left = right = points[i].x;
top = bottom = points[i].y;
}
if (points[i].x < left)
left = points[i].x;
if (points[i].x > right)
right = points[i].x;
if (points[i].y < top)
top = points[i].y;
if (points[i].y > bottom)
bottom = points[i].y;
}
std::vector<cv::Point> box_points;
box_points.push_back(cv::Point(left, top));
box_points.push_back(cv::Point(left, bottom));
box_points.push_back(cv::Point(right, bottom));
box_points.push_back(cv::Point(right, top));
// Compute minimal bounding box for the ROI
// Note: for some unknown reason, width/height of the box are switched.
cv::RotatedRect box = cv::minAreaRect(cv::Mat(box_points));
std::cout << "box w:" << box.size.width << " h:" << box.size.height << std::endl;
// Draw bounding box in the original image (debugging purposes)
//cv::Point2f vertices[4];
//box.points(vertices);
//for (int i = 0; i < 4; ++i)
//{
// cv::line(img, vertices[i], vertices[(i + 1) % 4], cv::Scalar(0, 255, 0), 1, CV_AA);
//}
//cv::imshow("Original", img);
//cv::waitKey(0);
// Set the ROI to the area defined by the box
// Note: because the width/height of the box are switched,
// they were switched manually in the code below:
cv::Rect roi;
roi.x = box.center.x - (box.size.height / 2);
roi.y = box.center.y - (box.size.width / 2);
roi.width = box.size.height;
roi.height = box.size.width;
std::cout << "roi # " << roi.x << "," << roi.y << " " << roi.width << "x" << roi.height << std::endl;
// Crop the original image to the defined ROI
cv::Mat crop = img(roi);
// Display cropped ROI
cv::imshow("Cropped ROI", crop);
cv::waitKey(0);
return 0;
}

Seeing that the text is the only large blob, and everything else is barely larger than a pixel, a simple morphological opening should suffice
You can do this in opencv
or with imagemagic
Afterwards the white rectangle should be the only thing left in the image. You can find it with opencvs findcontours, with the CvBlobs library for opencv or with the imagemagick -crop function
Here is your image with 2 steps of erosion followed by 2 steps of dilation applied:
You can simply plug this image into the opencv findContours function as in the Squares tutorial example to get the position

input
#objective:
#1)compress large images to less than 1000x1000
#2)identify region of interests
#3)save rois in top to bottom order
import cv2
import os
def get_contour_precedence(contour, cols):
tolerance_factor = 10
origin = cv2.boundingRect(contour)
return ((origin[1] // tolerance_factor) * tolerance_factor) * cols + origin[0]
# Load image, grayscale, Gaussian blur, adaptive threshold
image = cv2.imread('./images/sample_0.jpg')
#compress the image if image size is >than 1000x1000
height, width, color = image.shape #unpacking tuple (height, width, colour) returned by image.shape
while(width > 1000):
height = height/2
width = width/2
print(int(height), int(width))
height = int(height)
width = int(width)
image = cv2.resize(image, (width, height))
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (9,9), 0)
thresh = cv2.adaptiveThreshold(gray,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV,11,30)
# Dilate to combine adjacent text contours
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9,9))
ret,thresh3 = cv2.threshold(image,127,255,cv2.THRESH_BINARY_INV)
dilate = cv2.dilate(thresh, kernel, iterations=4)
# Find contours, highlight text areas, and extract ROIs
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
#cnts = cv2.findContours(thresh3, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
#ORDER CONTOURS top to bottom
cnts.sort(key=lambda x:get_contour_precedence(x, image.shape[1]))
#delete previous roi images in folder roi to avoid
dir = './roi/'
for f in os.listdir(dir):
os.remove(os.path.join(dir, f))
ROI_number = 0
for c in cnts:
area = cv2.contourArea(c)
if area > 10000:
x,y,w,h = cv2.boundingRect(c)
#cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 3)
cv2.rectangle(image, (x, y), (x + w, y + h), (100,100,100), 1)
#use below code to write roi when results are good
ROI = image[y:y+h, x:x+w]
cv2.imwrite('roi/ROI_{}.jpg'.format(ROI_number), ROI)
ROI_number += 1
cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('image', image)
cv2.waitKey()
roi detection
output

Related

Radius of a disk in a binary image

I have binarized images like this one:
I need to determine the center and radius of the inner solid disk. As you can see, it is surrounded by a textured area which touches it, so that simple connected component detection doesn't work. Anyway, there is a void margin on a large part of the perimeter.
A possible cure could be by eroding until all the texture disappears or disconnects from the disk, but this can be time consuming and the number of iterations is unsure. (In addition, in some unlucky cases there are tiny holes in the disk, which will grow with erosion.)
Any better suggestion to address this problem in a robust and fast way ? (I tagged OpenCV, but this is not mandated, what matters is the approach.)
You can:
Invert the image
Find the largest axis-aligned rectangle containing only zeros, (I used my C++ code from this answer). The algorithm is pretty fast.
Get the center and radius of the circle from the rectangle
Code:
#include <opencv2\opencv.hpp>
using namespace std;
using namespace cv;
// https://stackoverflow.com/a/30418912/5008845
cv::Rect findMaxRect(const cv::Mat1b& src)
{
cv::Mat1f W(src.rows, src.cols, float(0));
cv::Mat1f H(src.rows, src.cols, float(0));
cv::Rect maxRect(0,0,0,0);
float maxArea = 0.f;
for (int r = 0; r < src.rows; ++r)
{
for (int c = 0; c < src.cols; ++c)
{
if (src(r, c) == 0)
{
H(r, c) = 1.f + ((r>0) ? H(r-1, c) : 0);
W(r, c) = 1.f + ((c>0) ? W(r, c-1) : 0);
}
float minw = W(r,c);
for (int h = 0; h < H(r, c); ++h)
{
minw = std::min(minw, W(r-h, c));
float area = (h+1) * minw;
if (area > maxArea)
{
maxArea = area;
maxRect = cv::Rect(cv::Point(c - minw + 1, r - h), cv::Point(c+1, r+1));
}
}
}
}
return maxRect;
}
int main()
{
cv::Mat1b img = cv::imread("path/to/img", cv::IMREAD_GRAYSCALE);
// Correct image
img = img > 127;
cv::Rect r = findMaxRect(~img);
cv::Point center ( std::round(r.x + r.width / 2.f), std::round(r.y + r.height / 2.f));
int radius = std::sqrt(r.width*r.width + r.height*r.height) / 2;
cv::Mat3b out;
cv::cvtColor(img, out, cv::COLOR_GRAY2BGR);
cv::rectangle(out, r, cv::Scalar(0, 255, 0));
cv::circle(out, center, radius, cv::Scalar(0, 0, 255));
return 0;
}
My method is to use morph-open, findcontours, and minEnclosingCircle as follow:
#!/usr/bin/python3
# 2018/11/29 20:03
import cv2
fname = "test.png"
img = cv2.imread(fname)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
th, threshed = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3,3))
morphed = cv2.morphologyEx(threshed, cv2.MORPH_OPEN, kernel, iterations = 3)
cnts = cv2.findContours(morphed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
cnt = max(cnts, key=cv2.contourArea)
pt, r = cv2.minEnclosingCircle(cnt)
pt = (int(pt[0]), int(pt[1]))
r = int(r)
print("center: {}\nradius: {}".format(pt, r))
The final result:
center: (184, 170)
radius: 103
My second attempt on this case. This time I am using morphological closing operation to weaken the noise and maintain the signal. This is followed by a simple threshold and a connectedcomponent analysis. I hope this code can run faster.
Using this method, i can find the centroid with subpixel accuracy
('center : ', (184.12244328746746, 170.59771290442544))
Radius is derived from the area of the circle.
('radius : ', 101.34704439389715)
Here is the full code
import cv2
import numpy as np
# load image in grayscale
image = cv2.imread('radius.png',0)
r,c = image.shape
# remove noise
blured = cv2.blur(image,(5,5))
# Morphological closing
morph = cv2.erode(blured,None,iterations = 3)
morph = cv2.dilate(morph,None,iterations = 3)
cv2.imshow("morph",morph)
cv2.waitKey(0)
# Get the strong signal
th, th_img = cv2.threshold(morph,200,255,cv2.THRESH_BINARY)
cv2.imshow("th_img",th_img)
cv2.waitKey(0)
# Get connected components
num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(th_img)
print(num_labels)
print(stats)
# displat labels
labels_disp = np.uint8(255*labels/np.max(labels))
cv2.imshow("labels",labels_disp)
cv2.waitKey(0)
# Find center label
cnt_label = labels[r/2,c/2]
# Find circle center and radius
# Radius calculated by averaging the height and width of bounding box
area = stats[cnt_label][4]
radius = np.sqrt(area / np.pi)#stats[cnt_label][2]/2 + stats[cnt_label][3]/2)/2
cnt_pt = ((centroids[cnt_label][0]),(centroids[cnt_label][1]))
print('center : ',cnt_pt)
print('radius : ',radius)
# Display final result
edges_color = cv2.cvtColor(image,cv2.COLOR_GRAY2BGR)
cv2.circle(edges_color,(int(cnt_pt[0]),int(cnt_pt[1])),int(radius),(0,0,255),1)
cv2.circle(edges_color,(int(cnt_pt[0]),int(cnt_pt[1])),5,(0,0,255),-1)
x1 = stats[cnt_label][0]
y1 = stats[cnt_label][1]
w1 = stats[cnt_label][2]
h1 = stats[cnt_label][3]
cv2.rectangle(edges_color,(x1,y1),(x1+w1,y1+h1),(0,255,0))
cv2.imshow("edges_color",edges_color)
cv2.waitKey(0)
Here is an example of using hough circle. It can work if you set the min and max radius to a proper range.
import cv2
import numpy as np
# load image in grayscale
image = cv2.imread('radius.png',0)
r , c = image.shape
# remove noise
dst = cv2.blur(image,(5,5))
# Morphological closing
dst = cv2.erode(dst,None,iterations = 3)
dst = cv2.dilate(dst,None,iterations = 3)
# Find Hough Circle
circles = cv2.HoughCircles(dst
,cv2.HOUGH_GRADIENT
,2
,minDist = 0.5* r
,param2 = 150
,minRadius = int(0.5 * r / 2.0)
,maxRadius = int(0.75 * r / 2.0)
)
# Display
edges_color = cv2.cvtColor(image,cv2.COLOR_GRAY2BGR)
for i in circles[0]:
print(i)
cv2.circle(edges_color,(i[0],i[1]),i[2],(0,0,255),1)
cv2.imshow("edges_color",edges_color)
cv2.waitKey(0)
Here is the result
[185. 167. 103.6]
Have you tried something along the lines of the Circle Hough Transform?
I see that OpenCv has its own implementation. Some preprocessing (median filtering?) might be necessary here, though.
Here is a simple approach:
Erode the image (using a large, circular SE), then find the centroid of the result. This should be really close to the centroid of the central disk.
Compute the mean as a function of the radius of the original image, using the computed centroid as the center.
The output looks like this:
From here, determining the radius is quite simple.
Here is the code, I'm using PyDIP (we don't yet have a binary distribution, you'll need to download and build form sources):
import matplotlib.pyplot as pp
import PyDIP as dip
import numpy as np
img = dip.Image(pp.imread('/home/cris/tmp/FDvQm.png')[:,:,0])
b = dip.Erosion(img, 30)
c = dip.CenterOfMass(b)
rmean = dip.RadialMean(img, center=c)
pp.plot(rmean)
r = np.argmax(rmean < 0.5)
Here, r is 102, as the radius in integer number of pixels, I'm sure it's possible to interpolate to improve precision. c is [184.02, 170.45].

python segment an image of text line by line [duplicate]

I am trying to find a way to break the split the lines of text in a scanned document that has been adaptive thresholded. Right now, I am storing the pixel values of the document as unsigned ints from 0 to 255, and I am taking the average of the pixels in each line, and I split the lines into ranges based on whether the average of the pixels values is larger than 250, and then I take the median of each range of lines for which this holds. However, this methods sometimes fails, as there can be black splotches on the image.
Is there a more noise-resistant way to do this task?
EDIT: Here is some code. "warped" is the name of the original image, "cuts" is where I want to split the image.
warped = threshold_adaptive(warped, 250, offset = 10)
warped = warped.astype("uint8") * 255
# get areas where we can split image on whitespace to make OCR more accurate
color_level = np.array([np.sum(line) / len(line) for line in warped])
cuts = []
i = 0
while(i < len(color_level)):
if color_level[i] > 250:
begin = i
while(color_level[i] > 250):
i += 1
cuts.append((i + begin)/2) # middle of the whitespace region
else:
i += 1
EDIT 2: Sample image added
From your input image, you need to make text as white, and background as black
You need then to compute the rotation angle of your bill. A simple approach is to find the minAreaRect of all white points (findNonZero), and you get:
Then you can rotate your bill, so that text is horizontal:
Now you can compute horizontal projection (reduce). You can take the average value in each line. Apply a threshold th on the histogram to account for some noise in the image (here I used 0, i.e. no noise). Lines with only background will have a value >0, text lines will have value 0 in the histogram. Then take the average bin coordinate of each continuous sequence of white bins in the histogram. That will be the y coordinate of your lines:
Here the code. It's in C++, but since most of the work is with OpenCV functions, it should be easy convertible to Python. At least, you can use this as a reference:
#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;
int main()
{
// Read image
Mat3b img = imread("path_to_image");
// Binarize image. Text is white, background is black
Mat1b bin;
cvtColor(img, bin, COLOR_BGR2GRAY);
bin = bin < 200;
// Find all white pixels
vector<Point> pts;
findNonZero(bin, pts);
// Get rotated rect of white pixels
RotatedRect box = minAreaRect(pts);
if (box.size.width > box.size.height)
{
swap(box.size.width, box.size.height);
box.angle += 90.f;
}
Point2f vertices[4];
box.points(vertices);
for (int i = 0; i < 4; ++i)
{
line(img, vertices[i], vertices[(i + 1) % 4], Scalar(0, 255, 0));
}
// Rotate the image according to the found angle
Mat1b rotated;
Mat M = getRotationMatrix2D(box.center, box.angle, 1.0);
warpAffine(bin, rotated, M, bin.size());
// Compute horizontal projections
Mat1f horProj;
reduce(rotated, horProj, 1, CV_REDUCE_AVG);
// Remove noise in histogram. White bins identify space lines, black bins identify text lines
float th = 0;
Mat1b hist = horProj <= th;
// Get mean coordinate of white white pixels groups
vector<int> ycoords;
int y = 0;
int count = 0;
bool isSpace = false;
for (int i = 0; i < rotated.rows; ++i)
{
if (!isSpace)
{
if (hist(i))
{
isSpace = true;
count = 1;
y = i;
}
}
else
{
if (!hist(i))
{
isSpace = false;
ycoords.push_back(y / count);
}
else
{
y += i;
count++;
}
}
}
// Draw line as final result
Mat3b result;
cvtColor(rotated, result, COLOR_GRAY2BGR);
for (int i = 0; i < ycoords.size(); ++i)
{
line(result, Point(0, ycoords[i]), Point(result.cols, ycoords[i]), Scalar(0, 255, 0));
}
return 0;
}
Basic steps as #Miki,
read the source
threshed
find minAreaRect
warp by the rotated matrix
find and draw upper and lower bounds
While code in Python:
#!/usr/bin/python3
# 2018.01.16 01:11:49 CST
# 2018.01.16 01:55:01 CST
import cv2
import numpy as np
## (1) read
img = cv2.imread("img02.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
## (2) threshold
th, threshed = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV|cv2.THRESH_OTSU)
## (3) minAreaRect on the nozeros
pts = cv2.findNonZero(threshed)
ret = cv2.minAreaRect(pts)
(cx,cy), (w,h), ang = ret
if w>h:
w,h = h,w
ang += 90
## (4) Find rotated matrix, do rotation
M = cv2.getRotationMatrix2D((cx,cy), ang, 1.0)
rotated = cv2.warpAffine(threshed, M, (img.shape[1], img.shape[0]))
## (5) find and draw the upper and lower boundary of each lines
hist = cv2.reduce(rotated,1, cv2.REDUCE_AVG).reshape(-1)
th = 2
H,W = img.shape[:2]
uppers = [y for y in range(H-1) if hist[y]<=th and hist[y+1]>th]
lowers = [y for y in range(H-1) if hist[y]>th and hist[y+1]<=th]
rotated = cv2.cvtColor(rotated, cv2.COLOR_GRAY2BGR)
for y in uppers:
cv2.line(rotated, (0,y), (W, y), (255,0,0), 1)
for y in lowers:
cv2.line(rotated, (0,y), (W, y), (0,255,0), 1)
cv2.imwrite("result.png", rotated)
Finally result:

how to get Matrix(ROI) in openCV using by line that inclined

I already tried to search about openCV ROI function, but All of them used rectangle roi function.
I want to get roi using by inclined line that get from hough transform function.
My situation is next :
I have multiple vertical lines(little inclined) that output from hough transform function.
i want to get image(Matrix) between vertical lines.
enter image description here
i want to get divided matrix in my image (For example, A image, B image, C image etc.. )
Is there ROI function that used line in openCV?
or
any another method?
I think you need to use contours to define your roi. If it is not a perfect square you can not use the ROI function, because this is always a perfect square (not even a rotated square)
int main()
{
enum hierIdx { H_NEXT = 0, H_PREVIOUS, H_FIRST_CHILD, H_PARENT };
cv::Mat img = cv::imread("example_image.jpg", cv::IMREAD_UNCHANGED);
// convert RGB to gray scale image
cv::Mat imgGrs;
cv::cvtColor(img, imgGrs, cv::COLOR_RGB2GRAY);
// because it was a .jpg the grey values are messed up
// we fix it by thresholding at 128
cv::threshold(imgGrs, imgGrs, 128, 255, cv::THRESH_BINARY);
imgGrs = ~imgGrs;
// now create contours (we need the hierarchy to find the inner shapes)
std::vector<std::vector<cv::Point> > contours;
std::vector<cv::Vec4i> hierarchy;
cv::findContours(imgGrs.clone(), contours, hierarchy, CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE);
//cv::drawContours(img, contours, -1, cv::Scalar(255, 0, 0), 1);
int iLen = (int)hierarchy.size();
int idxChild = -1;
// find first child of master
for (int i = 0; i < iLen; i++){
if (hierarchy[i][H_PARENT] < 0) {
idxChild = hierarchy[i][H_FIRST_CHILD];
break;
}
}
// used for erosion of mask
cv::Mat element = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(3, 3));
while (idxChild >= 0)
{
// create image to use as mask for section
cv::Mat mask = cv::Mat::zeros(imgGrs.size(), CV_8U);
cv::drawContours(mask, contours, idxChild, cv::Scalar(255), CV_FILLED);
// make masker 1 pixel smaller so we wont see the outer contours
cv::erode(mask, mask, element);
// ok nu we create a singled out part we want
cv::Mat part = imgGrs & mask;
// Crop it to the AOI rectangle
cv::Rect aoi = cv::boundingRect(contours[idxChild]);
part = part(aoi);
// part is now the aoi image you asked for
// proceed to next AOI
idxChild = hierarchy[idxChild][H_NEXT];
}
return 0;
}

Letter inside letter, pattern recognition

I would like to detect this pattern
As you can see it's basically the letter C, inside another, with different orientations. My pattern can have multiple C's inside one another, the one I'm posting with 2 C's is just a sample. I would like to detect how many C's there are, and the orientation of each one. For now I've managed to detect the center of such pattern, basically I've managed to detect the center of the innermost C. Could you please provide me with any ideas about different algorithms I could use?
And here we go! A high level overview of this approach can be described as the sequential execution of the following steps:
Load the input image;
Convert it to grayscale;
Threshold it to generate a binary image;
Use the binary image to find contours;
Fill each area of contours with a different color (so we can extract each letter separately);
Create a mask for each letter found to isolate them in separate images;
Crop the images to the smallest possible size;
Figure out the center of the image;
Figure out the width of the letter's border to identify the exact center of the border;
Scan along the border (in a circular fashion) for discontinuity;
Figure out an approximate angle for the discontinuity, thus identifying the amount of rotation of the letter.
I don't want to get into too much detail since I'm sharing the source code, so feel free to test and change it in any way you like.
Let's start, Winter Is Coming:
#include <iostream>
#include <vector>
#include <cmath>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
cv::RNG rng(12345);
float PI = std::atan(1) * 4;
void isolate_object(const cv::Mat& input, cv::Mat& output)
{
if (input.channels() != 1)
{
std::cout << "isolate_object: !!! input must be grayscale" << std::endl;
return;
}
// Store the set of points in the image before assembling the bounding box
std::vector<cv::Point> points;
cv::Mat_<uchar>::const_iterator it = input.begin<uchar>();
cv::Mat_<uchar>::const_iterator end = input.end<uchar>();
for (; it != end; ++it)
{
if (*it) points.push_back(it.pos());
}
// Compute minimal bounding box
cv::RotatedRect box = cv::minAreaRect(cv::Mat(points));
// Set Region of Interest to the area defined by the box
cv::Rect roi;
roi.x = box.center.x - (box.size.width / 2);
roi.y = box.center.y - (box.size.height / 2);
roi.width = box.size.width;
roi.height = box.size.height;
// Crop the original image to the defined ROI
output = input(roi);
}
For more details on the implementation of isolate_object() please check this thread. cv::RNG is used later on to fill each contour with a different color, and PI, well... you know PI.
int main(int argc, char* argv[])
{
// Load input (colored, 3-channel, BGR)
cv::Mat input = cv::imread("test.jpg");
if (input.empty())
{
std::cout << "!!! Failed imread() #1" << std::endl;
return -1;
}
// Convert colored image to grayscale
cv::Mat gray;
cv::cvtColor(input, gray, CV_BGR2GRAY);
// Execute a threshold operation to get a binary image from the grayscale
cv::Mat binary;
cv::threshold(gray, binary, 128, 255, cv::THRESH_BINARY);
The binary image looks exactly like the input because it only had 2 colors (B&W):
// Find the contours of the C's in the thresholded image
std::vector<std::vector<cv::Point> > contours;
cv::findContours(binary, contours, cv::RETR_LIST, cv::CHAIN_APPROX_SIMPLE);
// Fill the contours found with unique colors to isolate them later
cv::Mat colored_contours = input.clone();
std::vector<cv::Scalar> fill_colors;
for (size_t i = 0; i < contours.size(); i++)
{
std::vector<cv::Point> cnt = contours[i];
double area = cv::contourArea(cv::Mat(cnt));
//std::cout << "* Area: " << area << std::endl;
// Fill each C found with a different color.
// If the area is larger than 100k it's probably the white background, so we ignore it.
if (area > 10000 && area < 100000)
{
cv::Scalar color = cv::Scalar(rng.uniform(0, 255), rng.uniform(0,255), rng.uniform(0,255));
cv::drawContours(colored_contours, contours, i, color,
CV_FILLED, 8, std::vector<cv::Vec4i>(), 0, cv::Point());
fill_colors.push_back(color);
//cv::imwrite("test_contours.jpg", colored_contours);
}
}
What colored_contours looks like:
// Create a mask for each C found to isolate them from each other
for (int i = 0; i < fill_colors.size(); i++)
{
// After inRange() single_color_mask stores a single C letter
cv::Mat single_color_mask = cv::Mat::zeros(input.size(), CV_8UC1);
cv::inRange(colored_contours, fill_colors[i], fill_colors[i], single_color_mask);
//cv::imwrite("test_mask.jpg", single_color_mask);
Since this for loop is executed twice, one for each color that was used to fill the contours, I want you to see all images that were generated by this stage. So the following images are the ones that were stored by single_color_mask (one for each iteration of the loop):
// Crop image to the area of the object
cv::Mat cropped;
isolate_object(single_color_mask, cropped);
//cv::imwrite("test_cropped.jpg", cropped);
cv::Mat orig_cropped = cropped.clone();
These are the ones that were stored by cropped (by the way, the smaller C looks fat because the image is rescaled by this page to have the same size of the larger C, don't worry):
// Figure out the center of the image
cv::Point obj_center(cropped.cols/2, cropped.rows/2);
//cv::circle(cropped, obj_center, 3, cv::Scalar(128, 128, 128));
//cv::imwrite("test_cropped_center.jpg", cropped);
To make it clearer to understand for what obj_center is for, I painted a little gray circle for educational purposes on that location:
// Figure out the exact center location of the border
std::vector<cv::Point> border_points;
for (int y = 0; y < cropped.cols; y++)
{
if (cropped.at<uchar>(obj_center.x, y) != 0)
border_points.push_back(cv::Point(obj_center.x, y));
if (border_points.size() > 0 && cropped.at<uchar>(obj_center.x, y) == 0)
break;
}
if (border_points.size() == 0)
{
std::cout << "!!! Oops! No border detected." << std::endl;
return 0;
}
// Figure out the exact center location of the border
cv::Point border_center = border_points[border_points.size() / 2];
//cv::circle(cropped, border_center, 3, cv::Scalar(128, 128, 128));
//cv::imwrite("test_border_center.jpg", cropped);
The procedure above scans a single vertical line from top/middle of the image to find the borders of the circle to be able to calculate it's width. Again, for education purposes I painted a small gray circle in the middle of the border. This is what cropped looks like:
// Scan the border of the circle for discontinuities
int radius = obj_center.y - border_center.y;
if (radius < 0)
radius *= -1;
std::vector<cv::Point> discontinuity_points;
std::vector<int> discontinuity_angles;
for (int angle = 0; angle <= 360; angle++)
{
int x = obj_center.x + (radius * cos((angle+90) * (PI / 180.f)));
int y = obj_center.y + (radius * sin((angle+90) * (PI / 180.f)));
if (cropped.at<uchar>(x, y) < 128)
{
discontinuity_points.push_back(cv::Point(y, x));
discontinuity_angles.push_back(angle);
//cv::circle(cropped, cv::Point(y, x), 1, cv::Scalar(128, 128, 128));
}
}
//std::cout << "Discontinuity size: " << discontinuity_points.size() << std::endl;
if (discontinuity_points.size() == 0 && discontinuity_angles.size() == 0)
{
std::cout << "!!! Oops! No discontinuity detected. It's a perfect circle, dang!" << std::endl;
return 0;
}
Great, so the piece of code above scans along the middle of the circle's border looking for discontinuity. I'm sharing a sample image to illustrate what I mean. Every gray dot on the image represents a pixel that is tested. When the pixel is black it means we found a discontinuity:
// Figure out the approximate angle of the discontinuity:
// the first angle found will suffice for this demo.
int approx_angle = discontinuity_angles[0];
std::cout << "#" << i << " letter C is rotated approximately at: " << approx_angle << " degrees" << std::endl;
// Figure out the central point of the discontinuity
cv::Point discontinuity_center;
for (int a = 0; a < discontinuity_points.size(); a++)
discontinuity_center += discontinuity_points[a];
discontinuity_center.x /= discontinuity_points.size();
discontinuity_center.y /= discontinuity_points.size();
cv::circle(orig_cropped, discontinuity_center, 2, cv::Scalar(128, 128, 128));
cv::imshow("Original crop", orig_cropped);
cv::waitKey(0);
}
return 0;
}
Very well... This last piece of code is responsible for figuring out the approximate angle of the discontinuity as well as indicate the central point of discontinuity. The following images are stored by orig_cropped. Once again I added a gray dot to show the exact positions detected as the center of the gaps:
When executed, this application prints the following information to the screen:
#0 letter C is rotated approximately at: 49 degrees
#1 letter C is rotated approximately at: 0 degrees
I hope it helps.
For start you could use Hough transformation. This algorithm is not very fast, but it's quite robust. Especially if you have such clear images.
The general approach would be:
1) preprocessing - suppress noise, convert to grayscale / binary
2) run edge detector
3) run Hough transform - IIRC it's `cv::HoughCircles` in OpenCV
4) do some postprocessing - remove surplus circles, decide which ones correspond to shape of letter C, and so on
My approach will give you 2 hough circles per letter C. One on inner boundary, one on outer letter C. If you want only one circle per letter you can use skeletonization algoritm. More info here http://homepages.inf.ed.ac.uk/rbf/HIPR2/skeleton.htm
Given that we have nested C structures and you know the centres of the Cs and would like to evaluate the orientations- one simply needs to observe the distribution of pixels along the radius of the concentric Cs in all directions.
This can be done by performing a simple morphological dilation operation from the centre. As we reach the right radius for the innermost C, we will reach a maximum number of pixels covered for the innermost C. The difference between the disc and the C will give us the location of the gap in the whole and one can perform an ultimate erosion to get the centroid of the gap in the C. The angle between the centre and this point is the orientation of the C. This step is iterated till all Cs are covered.
This can also be done quickly using the Distance function from the centre point of the Cs.

Detect semicircle in OpenCV

I am trying to detect full circles and semicircles in an image.
I am following the below mentioned procedure:
Process image (including Canny edge detection).
Find contours and draw them on an empty image, so that I can eliminate unwanted components
(The processed image is exactly what I want).
Detect circles using HoughCircles. And, this is what I get:
I tried varying the parameters in HoughCircles but the results are not consistent as it varies based on lighting and the position of circles in the image.
I accept or reject a circle based on its size. So, the result is not acceptable. Also, I have a long list of "acceptable" circles. So, I need some allowance in the HoughCircle params.
As for the full circles, it's easy - I can simply find the "roundness" of the contour. The problem is semicircles!
Please find the edited image before Hough transform
Use houghCircle directly on your image, don't extract edges first.
Then test for each detected circle, how much percentage is really present in the image:
int main()
{
cv::Mat color = cv::imread("../houghCircles.png");
cv::namedWindow("input"); cv::imshow("input", color);
cv::Mat canny;
cv::Mat gray;
/// Convert it to gray
cv::cvtColor( color, gray, CV_BGR2GRAY );
// compute canny (don't blur with that image quality!!)
cv::Canny(gray, canny, 200,20);
cv::namedWindow("canny2"); cv::imshow("canny2", canny>0);
std::vector<cv::Vec3f> circles;
/// Apply the Hough Transform to find the circles
cv::HoughCircles( gray, circles, CV_HOUGH_GRADIENT, 1, 60, 200, 20, 0, 0 );
/// Draw the circles detected
for( size_t i = 0; i < circles.size(); i++ )
{
Point center(cvRound(circles[i][0]), cvRound(circles[i][1]));
int radius = cvRound(circles[i][2]);
cv::circle( color, center, 3, Scalar(0,255,255), -1);
cv::circle( color, center, radius, Scalar(0,0,255), 1 );
}
//compute distance transform:
cv::Mat dt;
cv::distanceTransform(255-(canny>0), dt, CV_DIST_L2 ,3);
cv::namedWindow("distance transform"); cv::imshow("distance transform", dt/255.0f);
// test for semi-circles:
float minInlierDist = 2.0f;
for( size_t i = 0; i < circles.size(); i++ )
{
// test inlier percentage:
// sample the circle and check for distance to the next edge
unsigned int counter = 0;
unsigned int inlier = 0;
cv::Point2f center((circles[i][0]), (circles[i][1]));
float radius = (circles[i][2]);
// maximal distance of inlier might depend on the size of the circle
float maxInlierDist = radius/25.0f;
if(maxInlierDist<minInlierDist) maxInlierDist = minInlierDist;
//TODO: maybe paramter incrementation might depend on circle size!
for(float t =0; t<2*3.14159265359f; t+= 0.1f)
{
counter++;
float cX = radius*cos(t) + circles[i][0];
float cY = radius*sin(t) + circles[i][1];
if(dt.at<float>(cY,cX) < maxInlierDist)
{
inlier++;
cv::circle(color, cv::Point2i(cX,cY),3, cv::Scalar(0,255,0));
}
else
cv::circle(color, cv::Point2i(cX,cY),3, cv::Scalar(255,0,0));
}
std::cout << 100.0f*(float)inlier/(float)counter << " % of a circle with radius " << radius << " detected" << std::endl;
}
cv::namedWindow("output"); cv::imshow("output", color);
cv::imwrite("houghLinesComputed.png", color);
cv::waitKey(-1);
return 0;
}
For this input:
It gives this output:
The red circles are Hough results.
The green sampled dots on the circle are inliers.
The blue dots are outliers.
Console output:
100 % of a circle with radius 27.5045 detected
100 % of a circle with radius 25.3476 detected
58.7302 % of a circle with radius 194.639 detected
50.7937 % of a circle with radius 23.1625 detected
79.3651 % of a circle with radius 7.64853 detected
If you want to test RANSAC instead of Hough, have a look at this.
Here is another way to do it, a simple RANSAC version (much optimization to be done to improve speed), that works on the Edge Image.
the method loops these steps until it is cancelled
choose randomly 3 edge pixel
estimate circle from them (3 points are enough to identify a circle)
verify or falsify that it's really a circle: count how much percentage of the circle is represented by the given edges
if a circle is verified, remove the circle from input/egdes
int main()
{
//RANSAC
//load edge image
cv::Mat color = cv::imread("../circleDetectionEdges.png");
// convert to grayscale
cv::Mat gray;
cv::cvtColor(color, gray, CV_RGB2GRAY);
// get binary image
cv::Mat mask = gray > 0;
//erode the edges to obtain sharp/thin edges (undo the blur?)
cv::erode(mask, mask, cv::Mat());
std::vector<cv::Point2f> edgePositions;
edgePositions = getPointPositions(mask);
// create distance transform to efficiently evaluate distance to nearest edge
cv::Mat dt;
cv::distanceTransform(255-mask, dt,CV_DIST_L1, 3);
//TODO: maybe seed random variable for real random numbers.
unsigned int nIterations = 0;
char quitKey = 'q';
std::cout << "press " << quitKey << " to stop" << std::endl;
while(cv::waitKey(-1) != quitKey)
{
//RANSAC: randomly choose 3 point and create a circle:
//TODO: choose randomly but more intelligent,
//so that it is more likely to choose three points of a circle.
//For example if there are many small circles, it is unlikely to randomly choose 3 points of the same circle.
unsigned int idx1 = rand()%edgePositions.size();
unsigned int idx2 = rand()%edgePositions.size();
unsigned int idx3 = rand()%edgePositions.size();
// we need 3 different samples:
if(idx1 == idx2) continue;
if(idx1 == idx3) continue;
if(idx3 == idx2) continue;
// create circle from 3 points:
cv::Point2f center; float radius;
getCircle(edgePositions[idx1],edgePositions[idx2],edgePositions[idx3],center,radius);
float minCirclePercentage = 0.4f;
// inlier set unused at the moment but could be used to approximate a (more robust) circle from alle inlier
std::vector<cv::Point2f> inlierSet;
//verify or falsify the circle by inlier counting:
float cPerc = verifyCircle(dt,center,radius, inlierSet);
if(cPerc >= minCirclePercentage)
{
std::cout << "accepted circle with " << cPerc*100.0f << " % inlier" << std::endl;
// first step would be to approximate the circle iteratively from ALL INLIER to obtain a better circle center
// but that's a TODO
std::cout << "circle: " << "center: " << center << " radius: " << radius << std::endl;
cv::circle(color, center,radius, cv::Scalar(255,255,0),1);
// accept circle => remove it from the edge list
cv::circle(mask,center,radius,cv::Scalar(0),10);
//update edge positions and distance transform
edgePositions = getPointPositions(mask);
cv::distanceTransform(255-mask, dt,CV_DIST_L1, 3);
}
cv::Mat tmp;
mask.copyTo(tmp);
// prevent cases where no fircle could be extracted (because three points collinear or sth.)
// filter NaN values
if((center.x == center.x)&&(center.y == center.y)&&(radius == radius))
{
cv::circle(tmp,center,radius,cv::Scalar(255));
}
else
{
std::cout << "circle illegal" << std::endl;
}
++nIterations;
cv::namedWindow("RANSAC"); cv::imshow("RANSAC", tmp);
}
std::cout << nIterations << " iterations performed" << std::endl;
cv::namedWindow("edges"); cv::imshow("edges", mask);
cv::namedWindow("color"); cv::imshow("color", color);
cv::imwrite("detectedCircles.png", color);
cv::waitKey(-1);
return 0;
}
float verifyCircle(cv::Mat dt, cv::Point2f center, float radius, std::vector<cv::Point2f> & inlierSet)
{
unsigned int counter = 0;
unsigned int inlier = 0;
float minInlierDist = 2.0f;
float maxInlierDistMax = 100.0f;
float maxInlierDist = radius/25.0f;
if(maxInlierDist<minInlierDist) maxInlierDist = minInlierDist;
if(maxInlierDist>maxInlierDistMax) maxInlierDist = maxInlierDistMax;
// choose samples along the circle and count inlier percentage
for(float t =0; t<2*3.14159265359f; t+= 0.05f)
{
counter++;
float cX = radius*cos(t) + center.x;
float cY = radius*sin(t) + center.y;
if(cX < dt.cols)
if(cX >= 0)
if(cY < dt.rows)
if(cY >= 0)
if(dt.at<float>(cY,cX) < maxInlierDist)
{
inlier++;
inlierSet.push_back(cv::Point2f(cX,cY));
}
}
return (float)inlier/float(counter);
}
inline void getCircle(cv::Point2f& p1,cv::Point2f& p2,cv::Point2f& p3, cv::Point2f& center, float& radius)
{
float x1 = p1.x;
float x2 = p2.x;
float x3 = p3.x;
float y1 = p1.y;
float y2 = p2.y;
float y3 = p3.y;
// PLEASE CHECK FOR TYPOS IN THE FORMULA :)
center.x = (x1*x1+y1*y1)*(y2-y3) + (x2*x2+y2*y2)*(y3-y1) + (x3*x3+y3*y3)*(y1-y2);
center.x /= ( 2*(x1*(y2-y3) - y1*(x2-x3) + x2*y3 - x3*y2) );
center.y = (x1*x1 + y1*y1)*(x3-x2) + (x2*x2+y2*y2)*(x1-x3) + (x3*x3 + y3*y3)*(x2-x1);
center.y /= ( 2*(x1*(y2-y3) - y1*(x2-x3) + x2*y3 - x3*y2) );
radius = sqrt((center.x-x1)*(center.x-x1) + (center.y-y1)*(center.y-y1));
}
std::vector<cv::Point2f> getPointPositions(cv::Mat binaryImage)
{
std::vector<cv::Point2f> pointPositions;
for(unsigned int y=0; y<binaryImage.rows; ++y)
{
//unsigned char* rowPtr = binaryImage.ptr<unsigned char>(y);
for(unsigned int x=0; x<binaryImage.cols; ++x)
{
//if(rowPtr[x] > 0) pointPositions.push_back(cv::Point2i(x,y));
if(binaryImage.at<unsigned char>(y,x) > 0) pointPositions.push_back(cv::Point2f(x,y));
}
}
return pointPositions;
}
input:
output:
console output:
press q to stop
accepted circle with 50 % inlier
circle: center: [358.511, 211.163] radius: 193.849
accepted circle with 85.7143 % inlier
circle: center: [45.2273, 171.591] radius: 24.6215
accepted circle with 100 % inlier
circle: center: [257.066, 197.066] radius: 27.819
circle illegal
30 iterations performed`
optimization should include:
use all inlier to fit a better circle
dont compute distance transform after each detected circles (it's quite expensive). compute inlier from point/edge set directly and remove the inlier edges from that list.
if there are many small circles in the image (and/or a lot of noise), it's unlikely to hit randomly 3 edge pixels or a circle. => try contour detection first and detect circles for each contour. after that try to detect all "other" circles left in the image.
a lot of other stuff
#Micka's answer is great, here it is roughly translated into python
The method takes a thresholded image mask as an argument, leaving that part as an exercise for the reader
def get_circle_percentages(image):
#compute distance transform of image
dist = cv2.distanceTransform(image, cv2.DIST_L2, 0)
rows = image.shape[0]
circles = cv2.HoughCircles(image, cv2.HOUGH_GRADIENT, 1, rows / 8, 50, param1=50, param2=10, minRadius=40, maxRadius=90)
minInlierDist = 2.0
for c in circles[0, :]:
counter = 0
inlier = 0
center = (c[0], c[1])
radius = c[2]
maxInlierDist = radius/25.0
if maxInlierDist < minInlierDist: maxInlierDist = minInlierDist
for i in np.arange(0, 2*np.pi, 0.1):
counter += 1
x = center[0] + radius * np.cos(i)
y = center[1] + radius * np.sin(i)
if dist.item(int(y), int(x)) < maxInlierDist:
inlier += 1
print(str(100.0*inlier/counter) + ' percent of a circle with radius ' + str(radius) + " detected")
I know that it's little bit late, but I used different approach which is much easier.
From the cv2.HoughCircles(...) you get centre of the circle and the diameter (x,y,r). So I simply go through all centre points of the circles and I check if they are further away from the edge of the image than their diameter.
Here is my code:
height, width = img.shape[:2]
#test top edge
up = (circles[0, :, 0] - circles[0, :, 2]) >= 0
#test left edge
left = (circles[0, :, 1] - circles[0, :, 2]) >= 0
#test right edge
right = (circles[0, :, 0] + circles[0, :, 2]) <= width
#test bottom edge
down = (circles[0, :, 1] + circles[0, :, 2]) <= height
circles = circles[:, (up & down & right & left), :]
The semicircle detected by the hough algorithm is most probably correct. The issue here might be that unless you strictly control the geometry of the scene, i.e. exact position of the camera relative to the target, so that the image axis is normal to the target plane, you will get ellipsis rather than circles projected on the image plane. Not to mention the distortions caused by the optical system, which further degenerate the geometric figure. If you rely on precision here, I would recommend camera calibration.
You better try with different kernel for gaussian blur.That will help you
GaussianBlur( src_gray, src_gray, Size(11, 11), 5,5);
so change size(i,i),j,j)

Resources