I'm intending to write a program to detect and differentiate certain objects from a nearly solid background. The foreground and the background have a high contrast difference, which I would further increase to aid the object identification process. I'm planning to use the Hough transform technique with OpenCV.
Sample image
As seen in the above image, I want to separately identify the circular objects and the square objects (or any other shape out of a finite set of shapes). Since I'm quite new to image processing, I don't know whether such a situation requires a neural network, with each shape learned beforehand. Would a technique such as template matching let me do this without a neural network?
These posts will get you started:
How to detect circles
How to detect squares
How to detect a sheet of paper (advanced square detection)
You will probably have to adjust some parameters in these codes to match your circles/squares, but the core of the technique is shown on these examples.
If you intend to detect shapes other than just circles (and from the image I assume you do), I would recommend chamfer matching for a quick start, especially as you have good contrast.
The basic premise, explained in simple terms, is the following:
You do an edge detection (for example, cvCanny in OpenCV).
You create a distance image, where the value of each pixel gives the distance to the nearest edge.
You take the shapes you would like to detect, define sample points along the edges of each shape, and try to match these points on the distance image. Basically, for a given candidate position of your object, you sum the values of the distance image that lie "under" the coordinates of your sample points.
Find a good minimization algorithm to search for the position with the lowest score; how effective this is depends on your application. (A minimal sketch of the scoring step follows below.)
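A minimal sketch of the scoring step in Python/OpenCV; the file name and the square template are assumptions, and the brute-force search stands in for a proper minimizer:
import cv2
import numpy as np

# Edge detection on the search image ('scene.png' is a placeholder)
img = cv2.imread('scene.png', cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 100, 200)

# Distance image: each pixel holds the distance to the nearest edge.
# distanceTransform measures distance to the nearest zero pixel,
# so the edge map is inverted first.
dist = cv2.distanceTransform(255 - edges, cv2.DIST_L2, 3)

# Sample points along the edges of the shape to detect,
# here the outline of a hypothetical 20x20 square, as (x, y) offsets
tmpl = np.zeros((20, 20), np.uint8)
cv2.rectangle(tmpl, (0, 0), (19, 19), 255, 1)
pts = np.column_stack(np.nonzero(tmpl)[::-1])

# Chamfer score at position (x, y): sum of the distance values
# "under" the sample points; lower means a better match
def chamfer_score(x, y):
    return dist[pts[:, 1] + y, pts[:, 0] + x].sum()

# Brute-force search for the best position (replace with a minimizer)
h, w = dist.shape
best = min((chamfer_score(x, y), (x, y))
           for y in range(h - 20) for x in range(w - 20))
print('best score %.1f at %s' % best)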
This basic approach is a general solution that usually works well, but without further refinements it is very slow.
Usually it's a good idea to first separate the objects of interest, so you don't always have to do the full search on the whole image. Find a good threshold so you can separate the objects. You still won't know which object is which, but you only have to do the matching itself in the close vicinity of each object.
Another good idea is, instead of doing the full search on the high-resolution image, to first do it at a very low resolution. The result will not be very accurate, but it tells you the general areas where a higher-resolution search is worthwhile, so you don't waste time on areas with nothing of interest.
There are a number of more advanced techniques, but it's still worth taking a look at basic chamfer matching, as it is the basis of a large number of techniques.
With the assumption that the objects are simple shapes, here's an approach using thresholding + contour approximation. Contour approximation is based on the assumption that a curve can be approximated by a series of short line segments which can be used to determine the shape of a contour. For instance, a triangle has three vertices, a square/rectangle has four vertices, a pentagon has five vertices, and so on.
Obtain binary image. We load the image, convert to grayscale, Gaussian blur, then adaptive threshold to obtain a binary image.
Detect shapes. Find contours and identify the shape of each contour using contour approximation filtering. This can be done using arcLength to compute the perimeter of the contour and approxPolyDP to obtain the actual contour approximation.
Input image
Detected objects highlighted in green
Labeled contours
Code
import cv2

def detect_shape(c):
    # Compute perimeter of contour and perform contour approximation
    shape = ""
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.04 * peri, True)

    # Triangle
    if len(approx) == 3:
        shape = "triangle"
    # Square or rectangle
    elif len(approx) == 4:
        (x, y, w, h) = cv2.boundingRect(approx)
        ar = w / float(h)
        # A square will have an aspect ratio that is approximately
        # equal to one, otherwise the shape is a rectangle
        shape = "square" if ar >= 0.95 and ar <= 1.05 else "rectangle"
    # Star
    elif len(approx) == 10:
        shape = "star"
    # Otherwise assume the shape is a circle or oval
    else:
        shape = "circle"
    return shape

# Load image, grayscale, Gaussian blur, and adaptive threshold
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (7,7), 0)
thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY_INV, 31, 3)

# Find contours and detect shape
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

for c in cnts:
    # Identify shape
    shape = detect_shape(c)

    # Highlight contour in green, then find centroid and label shape name
    cv2.drawContours(image, [c], -1, (36,255,12), 2)
    M = cv2.moments(c)
    cX = int(M["m10"] / M["m00"])
    cY = int(M["m01"] / M["m00"])
    cv2.putText(image, shape, (cX - 20, cY), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (36,255,12), 2)

cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.waitKey()
Related
I am trying to identify a rectangle underwater in a noisy environment. I implemented Canny to find the edges, and drew the found edges using cv2.circle. From here, I am trying to identify the imperfect rectangle in the image (the black one below the long rectangle that covers the top of the frame)
I have attempted multiple solutions, including thresholds, blurs and resizing the image to detect the rectangle. Below is the barebones code with just drawing the identified edges.
import numpy as np
import cv2

img_text = 'img5.png'
img = cv2.imread(img_text)
original = img.copy()

min_value = 50
max_value = 100

# Detect edges and collect the coordinates of all edge pixels
image = cv2.Canny(img, min_value, max_value)
indices = np.where(image != 0)
coordinates = zip(indices[1], indices[0])

# Draw each edge pixel on the original image
for point in coordinates:
    cv2.circle(original, point, 1, (0, 0, 255), -1)

cv2.imshow('original', original)
cv2.waitKey(0)
cv2.destroyAllWindows()
Where the output displays this:
output
From here I want to be able to separately detect just the rectangle and draw another rectangle on top of the output in green, but I haven't been able to find a way to detect the original rectangle on its own.
For your specific image, I obtained quite good results with a simple thresholding on the blue channel.
image = cv2.imread("test.png")
t, img = cv2.threshold(image[:,:,0], 80, 255, cv2.THRESH_BINARY)
In order to adapt the threshold, I propose a simple way of varying the threshold until you get one component. I have also implemented the rectangle drawing:
import cv2
import numpy as np

def find_square(image):
    markers = 0
    threshold = 10
    # Increase the threshold until at least one connected component appears
    while np.amax(markers) == 0:
        threshold += 5
        t, img = cv2.threshold(image[:,:,0], threshold, 255, cv2.THRESH_BINARY_INV)
        _, markers = cv2.connectedComponents(img)

    # Remove small specks, then dilate to recover the lighter square edges
    kernel = np.ones((5,5), np.uint8)
    img = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)
    img = cv2.morphologyEx(img, cv2.MORPH_DILATE, kernel)

    # Draw the bounding rectangle of the remaining pixels
    nonzero = cv2.findNonZero(img)
    x, y, w, h = cv2.boundingRect(nonzero)
    cv2.rectangle(image, (x, y), (x+w, y+h), (0, 255, 0), 2)
    cv2.imshow("image", image)
And the results on the provided example images:
The idea behind this approach is based on the observation that most of the information is in the blue channel. If you separate the image into its channels, you will see that the dark square has the best contrast in the blue channel. It is also the darkest region of this channel, which is why thresholding works. The problem remains the threshold setting. Based on the above intuition, we are looking for the lowest threshold that will bring up something (and hope that it will be the square). What I did is simply increase the threshold gradually until something appears.
Then I applied some morphology operations to eliminate other small points that may appear after thresholding, and to make the square look a bit bigger (the edges of the square are lighter, so thresholding does not capture the entire square). Then it was a matter of drawing the rectangle.
The code can be made much nicer (and more efficient) with some statistical analysis on the histogram: simply compute the threshold such that 5% (or some other percentage) of the pixels are darker. You may need a connected component analysis afterwards to keep only the biggest blob.
Also, my usage of connectedComponents is very poor and inefficient; again, this is code written in a hurry to prove the concept.
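As a sketch of that percentile idea (the 5% figure is from the text above; the file name and the use of connectedComponentsWithStats are assumptions):
import cv2
import numpy as np

image = cv2.imread("test.png")
blue = image[:, :, 0]

# Pick the threshold such that roughly 5% of the pixels are darker
threshold = np.percentile(blue, 5)
t, img = cv2.threshold(blue, threshold, 255, cv2.THRESH_BINARY_INV)

# Keep only the biggest connected component (presumably the square)
n, labels, stats, _ = cv2.connectedComponentsWithStats(img)
if n > 1:
    biggest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])  # label 0 is background
    img = np.where(labels == biggest, 255, 0).astype(np.uint8)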
I was trying to detect billboard images on a random background. I was able to localize the billboard using SSD, which gives me an approximate bounding box around the billboard. Now I want to find the exact corners of the billboard for my application. I tried different strategies I came across, such as Harris corner detection (using OpenCV), finding intersections of Hough lines, and Canny + morphological operations + contours. Details on the output are given below.
Harris corner detection
The code for Harris corner detection is as follows:
img_patch_gray = np.float32(img_patch_gray)
harris_point = cv2.cornerHarris(img_patch_gray,2,3,0.04)
img_patch[harris_point>0.01*harris_point.max()]=[255,0,0]
plt.figure(figsize=IMAGE_SIZE)
plt.imshow(img_patch)
Here the red dots are the corners detected by the Harris corner detection algorithm and the points of interest are encircled in green.
Using Hough line detection
Here I was trying to find the intersections of the lines and then choose the corner points, similar to this stackoverflow link, but it is very difficult to get the exact lines, since billboards have text and graphics on them.
Contour based
In this approach I have used the Canny edge detector, followed by dilation (3×3 kernel), followed by contour extraction.
bin_img = cv2.Canny(gray_img_patch, 100, 250)
bin_img = cv2.dilate(bin_img, np.ones((3, 3), np.uint8))  # dilation with a 3x3 kernel
plt.imshow(bin_img, cmap='gray')
(_, cnts, _) = cv2.findContours(bin_img.copy(), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)[:10]
cv2.drawContours(img_patch, [cnts[0]], 0, (0, 255, 0), 1)
I had tried using the approxPolyDP function from OpenCV, but the results were not as expected, since it can also approximate larger or smaller contours by four points, and in some images it might not form a contour around the billboard frame.
I have used OpenCV 3.4 for all the image processing operations. The images used can be found here. Please note that the image discussed here is just for illustration purposes; in general, the image can be of any billboard.
Thanks in advance, any help is appreciated.
This is a very difficult task because the image contains a lot of noise. You can get an approximation of the contour, but the exact corners would be very hard. I have made an example of how I would attempt an approximation. It may not work on other images. Maybe it will help a bit or give you a new idea. Cheers!
import cv2
import numpy as np
# Read the image
img = cv2.imread('billboard.png')
# Blur the image with a big kernel and then transform to gray colorspace
blur = cv2.GaussianBlur(img,(19,19),0)
gray = cv2.cvtColor(blur,cv2.COLOR_BGR2GRAY)
# Perform histogram equalization on the blur and then perform Otsu threshold
equ = cv2.equalizeHist(gray)
_, thresh = cv2.threshold(equ,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# Perform opening on threshold with a big kernel (erosion followed by dilation)
kernel = np.ones((20,20),np.uint8)
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
# Search for contours and select the biggest one
_, contours, hierarchy = cv2.findContours(opening,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
cnt = max(contours, key=cv2.contourArea)
# Make a hull around the contour and draw it on the original image
mask = np.zeros((img.shape[:2]), np.uint8)
hull = cv2.convexHull(cnt)
cv2.drawContours(mask, [hull], 0, (255,255,255),-1)
# Search for contours and select the biggest one again
_, thresh = cv2.threshold(mask,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
_, contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
cnt = max(contours, key=cv2.contourArea)
# Approximate the contour and draw the approximation on the image
epsilon = 0.008*cv2.arcLength(cnt,True)
approx = cv2.approxPolyDP(cnt,epsilon,True)
cv2.drawContours(img, [approx], 0, (0,255,0), 5)
# Display the image
cv2.imshow('img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:
I have the following image
I'm trying to find the pixel coordinates of the main rectangles (those between the white lines). I tried a few things, but I can't obtain a good enough solution. The solution doesn't have to be perfect, and it's OK if not all rectangles are detected (especially the really small ones). However, the corner locations have to be as exact as possible, especially for the bigger, blurry ones (I'm trying to write a simple AR engine).
I can clarify that there are only 4 levels of grayscale: 0, 110, 180 and 255 (when printed, not on screen, they will vary because of lighting and shadows).
So far I have tried a few things:
manual multilevel thresholding (because of shadows and different lighting it didn't work)
adaptive thresholding: 2 problems:
it combines the 180 and 255 levels into white, and 0 and 110 into black
edge/corner locations of the blurred (bigger) rectangles are not exact (it adds blur to the rectangle area)
Sobel edge detection (corners of blurred rectangles are sharper, but it also detects inner edges in the rectangles, and those edge contours are not always closed)
It looks like combining those two somehow would yield better results. Or maybe somebody has a different idea?
I was also thinking about doing flood fill, but it was hard to reliably find a good seed point and threshold automatically (there might be other white objects in the background). Besides, I will want to optimize this for the GPU later, and the flood fill algorithm is not a good fit for that.
Below is some sample code I tried so far:
import cv2

image = cv2.imread('data/image.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
cv2.imshow('image', gray)

adaptive = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 601, 0)
cv2.imshow('adaptive', adaptive)

gradx = cv2.Sobel(gray, ddepth=cv2.CV_32F, dx=1, dy=0, ksize=3)
grady = cv2.Sobel(gray, ddepth=cv2.CV_32F, dx=0, dy=1, ksize=3)
abs_gradx = cv2.convertScaleAbs(gradx)
abs_grady = cv2.convertScaleAbs(grady)
grad = cv2.addWeighted(abs_gradx, 0.5, abs_grady, 0.5, 0)

kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (5,5))
grad = cv2.morphologyEx(grad, cv2.MORPH_OPEN, kernel)
grad = cv2.morphologyEx(grad, cv2.MORPH_CLOSE, kernel)
cv2.imshow('sobel', grad)

#kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (7,7))
#grad = cv2.morphologyEx(grad, cv2.MORPH_OPEN, kernel)

retval, grad = cv2.threshold(grad, 10, 255, cv2.THRESH_BINARY)
cv2.imshow('sobel+morph+thrs', grad)
cv2.waitKey()
I believe your answer lies in using the Hough transform to detect lines, extending these lines to span the breaks between the darker squares, and then looking for intersections marking out corners. I've had a quick play in Matlab and have come up with the following; it's not perfect, but it should show the potential:
% Open image
i = imread('http://i.stack.imgur.com/kwcXm.jpg');
% Use a sharpening filter to enhance some of the edges
H = fspecial('unsharp');
i = imfilter(i, H, 'replicate');
% Detect edge segments using canny
BW = edge(i, 'canny');
% Apply hough transform to edges
[H, T, R] = hough(BW, 'RhoResolution', 0.5, 'Theta', -90:0.5:89.5);
% Find peaks in hough transform
P = houghpeaks(H, 5, 'threshold', ceil(0.1*max(H(:))));
% Extract lines from peaks, extending partial lines
lines = houghlines(BW, T, R, P, 'FillGap', 100, 'MinLength', 5);
% Plot detected lines on image
imshow(i); hold on;
for k = 1:length(lines)
    xy = [lines(k).point1; lines(k).point2];
    plot(xy(:,1), xy(:,2), 'LineWidth', 2, 'Color', 'green');
end
With the final result:
Obviously there's room for improvement, with a number of lines still to detect, but if tweaking the various parameters doesn't work, you could take the initial result and then search for more lines with similar angles to obtain a more complete set. The corners can then be found from the intersections, which should be simple enough to extract.
I would try the following:
Hough transform to detect all straight lines (see the sketch after this list)
Look for sets of parallel lines that are:
A sufficient distance apart
Separated by the lighter color
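A minimal sketch of the line-detection step with OpenCV's probabilistic Hough transform (the file name and all parameters are assumptions to be tuned):
import cv2
import numpy as np

img = cv2.imread('rectangles.jpg', cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 50, 150)

# Detect straight line segments
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                        minLineLength=30, maxLineGap=10)

# Compute each segment's angle so near-parallel lines can be grouped
for line in lines if lines is not None else []:
    x1, y1, x2, y2 = line[0]
    angle = np.degrees(np.arctan2(y2 - y1, x2 - x1))
    print((x1, y1), (x2, y2), 'angle %.1f' % angle)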
There are a couple of things that make your problem trickier than it needs to be:
Perspective distortion
Changes in lighting, minor shadows
If you could minimize the above, it may help solving the problem.
I'm trying to use OpenCV to "parse" screenshots from the iPhone game Blocked. The screenshots are cropped to look like this:
I suppose for right now I'm just trying to find the coordinates of each of the 4 points that make up each rectangle. I did see the sample file squares.c that comes with OpenCV, but when I run that algorithm on this picture, it comes up with 72 rectangles, including the rectangular areas of whitespace that I obviously don't want to count as one of my rectangles. What is a better way to approach this? I tried doing some Google research, but for all of the search results, there is very little relevant usable information.
A similar issue has already been discussed:
How to recognize rectangles in this image?
As for your data, the rectangles you are trying to find are the only black objects, so you can try a threshold binarization: black pixels are the ones which have ALL three RGB values less than 40 (I found this empirically). This simple operation makes your picture look like this:
After that you could apply the Hough transform to find lines (discussed in the topic I referred to), or you can do it more simply. Compute the integral projections of the black pixels onto the X and Y axes. (The projection onto X is a vector whose entry for x_i is the number of black pixels whose first coordinate equals x_i.) You get the candidate x and y values as the peaks of the projections. Then look through all the possible segments restricted by the found x and y values (if there are a lot of black pixels between (x_i, y_j) and (x_i, y_k), there is probably a line there). Finally, compose the line segments into rectangles! A sketch of the projection step follows below.
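A sketch of the projection step in Python (the threshold of 40 is from the text above; the file name and the crude peak picking are assumptions):
import cv2
import numpy as np

image = cv2.imread('blocked.png')

# Black pixels: ALL three RGB values below 40
black = np.all(image < 40, axis=2).astype(np.uint8)

# Integral projections: black-pixel counts per column (X) and per row (Y)
proj_x = black.sum(axis=0)
proj_y = black.sum(axis=1)

# Crude peak picking: columns/rows containing many black pixels
xs = np.where(proj_x > 0.5 * proj_x.max())[0]
ys = np.where(proj_y > 0.5 * proj_y.max())[0]
print('candidate x values:', xs)
print('candidate y values:', ys)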
Here's a complete Python solution. The main idea is:
Apply pyramid mean shift filtering to help threshold accuracy
Otsu's threshold to get a binary image
Find contours and filter using contour approximation
Here's a visualization of each detected rectangle contour
Results
import cv2

image = cv2.imread('1.png')
blur = cv2.pyrMeanShiftFiltering(image, 11, 21)
gray = cv2.cvtColor(blur, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

for c in cnts:
    # Approximate the contour; four vertices indicate a rectangle
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.015 * peri, True)
    if len(approx) == 4:
        x,y,w,h = cv2.boundingRect(approx)
        cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)

cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.waitKey()
I wound up just building on my original method and doing as Robert suggested in his comment on my question. After I get my list of rectangles, I run through them and calculate the average color over each rectangle. I check whether the red, green, and blue components of the average color are each within 10% of the gray and blue rectangle colors; if they are, I save the rectangle, and if they aren't, I discard it. This process gives me something like this:
From this, it's trivial to get the information I need (orientation, starting point, and length of each rectangle, considering the game window as a 6x6 grid).
The blocks look like bitmaps - why don't you use simple template matching with different templates for each block size/color/orientation? A sketch follows below.
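For example, with OpenCV's matchTemplate (the file names and the 0.8 score threshold are assumptions):
import cv2
import numpy as np

image = cv2.imread('board.png')
template = cv2.imread('block_blue_horizontal.png')  # one template per size/color/orientation
h, w = template.shape[:2]

# Normalized cross-correlation between the template and every image position
res = cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)

# Keep every position scoring above the threshold
ys, xs = np.where(res >= 0.8)
for x, y in zip(xs.tolist(), ys.tolist()):
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)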
Since your problem is the small rectangles, I would start by removing them. Those lines are much thinner than the borders of the rectangles, so they can be removed with morphological operations on the image.
Using a structuring element that looks like this:
element = [ 1 1
            1 1 ]
an erosion should remove lines that are less than two pixels wide. After the small lines are removed, the rectangle-finding algorithm of OpenCV will most likely do the rest of the job for you.
The erosion can be done in OpenCV with the function cvErode.
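In Python OpenCV the same idea looks like this (a sketch; the file name is an assumption):
import cv2
import numpy as np

img = cv2.imread('blocked.png', cv2.IMREAD_GRAYSCALE)

# 2x2 structuring element of ones, as described above
element = np.ones((2, 2), np.uint8)

# Erosion removes foreground features thinner than the element
eroded = cv2.erode(img, element)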
Try one of the many corner detectors, like the Harris corner detector. Also, it is in general a good idea to try it at multiple resolutions, so do some preprocessing at varying magnifications.
It appears that you want some sort of color-dominated square. You can suppress the other colors by first using something like cvSplit and then thresholding the color, so only that region remains. Follow that with a cropping operation; I think that could work as well.
Given an object on a plain white background, does anybody know if OpenCV provides functionality to easily detect an object from a captured frame?
I'm trying to locate the corner/center points of an object (rectangle). The way I'm currently doing it is by brute force (scanning the image for the object), and it's not accurate. I'm wondering if there is functionality under the hood that I'm not aware of.
Edit Details:
The object is about the same size as a small soda can. The camera is positioned above the object, to give it a 2D/rectangular feel. The orientation/angle relative to the camera is random, and is calculated from the corner points.
It's just a white background, with the object on it (black). The quality of the shot is about what you'd expect to see from a Logitech webcam.
Once I get the corner points, I calculate the center. The center point is then converted to centimeters.
Refining just 'how' I get those 4 corners is what I'm trying to focus on. You can see my brute-force method in this image: Image
There's already an example of how to do rectangle detection in OpenCV (look in samples/squares.c), and it's quite simple, actually.
Here's the rough algorithm they use:
0. rectangles <- {}
1. image <- load image
2. for every channel:
   2.1 image_canny <- apply canny edge detector to this channel
   2.2 for threshold in bunch_of_increasing_thresholds:
       2.2.1 image_thresholds[threshold] <- apply threshold to this channel
   2.3 for each contour found in {image_canny} U image_thresholds:
       2.3.1 Approximate contour with polygons
       2.3.2 if the approximation has four corners and the angles are close to 90 degrees:
             2.3.2.1 rectangles <- rectangles U {contour}
It's not an exact transliteration of what they are doing, but it should help you.
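A rough Python sketch of that algorithm (the thresholds and the 0.3 cosine bound are assumptions; the cosine test is one way to check that the angles are close to 90 degrees):
import cv2
import numpy as np

def angle_cos(p0, p1, p2):
    # Cosine of the angle at vertex p1 (near 0 for a 90-degree corner)
    d1 = (p0 - p1).astype(np.float64)
    d2 = (p2 - p1).astype(np.float64)
    return abs(np.dot(d1, d2) / np.sqrt(np.dot(d1, d1) * np.dot(d2, d2)))

img = cv2.imread('object.png')
rectangles = []
for channel in cv2.split(img):
    # Canny edges plus a bunch of increasing thresholds
    binaries = [cv2.Canny(channel, 50, 150)]
    for t in range(50, 255, 50):
        binaries.append(cv2.threshold(channel, t, 255, cv2.THRESH_BINARY)[1])
    for binary in binaries:
        cnts = cv2.findContours(binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
        cnts = cnts[0] if len(cnts) == 2 else cnts[1]
        for c in cnts:
            # Approximate contour with a polygon
            approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
            if len(approx) == 4 and cv2.isContourConvex(approx):
                pts = approx.reshape(4, 2)
                # Keep it if all four angles are close to 90 degrees
                if max(angle_cos(pts[i], pts[(i + 1) % 4], pts[(i + 2) % 4])
                       for i in range(4)) < 0.3:
                    rectangles.append(approx)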
Hope this helps; it uses the moment method to get the centroid of a black-and-white image.
cv::Point getCentroid(cv::Mat img)
{
    cv::Point Coord;
    cv::Moments mm = cv::moments(img, false);
    double moment10 = mm.m10;
    double moment01 = mm.m01;
    double moment00 = mm.m00;
    Coord.x = int(moment10 / moment00);
    Coord.y = int(moment01 / moment00);
    return Coord;
}
OpenCV has heaps of functions that can help you achieve this. Download Emgu.CV for a C#.NET wrapper of the library if you are programming in that language.
Some methods of getting what you want:
Find the corners as before - e.g. "CornerHarris" OpenCV function
Threshold the image and calculate the centre of gravity - see http://www.roborealm.com/help/Center%20of%20Gravity.php ... this is the method I would use. You can even perform the thresholding in the COG routine, i.e. cog_x += *imagePtr < 128 ? 255 : 0; (a sketch follows after this list)
Find the moments of the image to give rotation, center of gravity etc - e.g. "Moments" OpenCV function. (I haven't used this)
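A sketch of the second method in Python/numpy, with the thresholding folded into the center-of-gravity computation (the file name is an assumption; the 128 threshold comes from the snippet above):
import cv2
import numpy as np

img = cv2.imread('object.png', cv2.IMREAD_GRAYSCALE)

# Threshold inside the COG routine: dark pixels (the object) count as 1
mask = img < 128
ys, xs = np.nonzero(mask)

# Center of gravity = mean position of the object pixels
if len(xs):
    print('center of gravity: (%.1f, %.1f)' % (xs.mean(), ys.mean()))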
(edit) The AForge.NET library has corner detection functions as well as an example project (MotionDetector) and libraries to connect to webcams. I think this would be the easiest way to go, assuming you are using Windows and .NET.
Since no one has posted a complete OpenCV solution, here's a simple approach:
Obtain binary image. We load the image, convert to grayscale, and then obtain a binary image using Otsu's threshold
Find outer contour. We find contours using findContours and then extract the bounding box coordinates using boundingRect
Find center coordinate. Since we have the contour, we can find the center coordinate using moments to extract the centroid of the contour
Here's an example with the bounding box and center point highlighted in green
Input image -> Output
Center: (100, 100)
Center: (200, 200)
Center: (300, 300)
So to recap:
Given an object on a plain white background, does anybody know if OpenCV provides functionality to easily detect an object from a captured frame?
First obtain a binary image (via Canny edge detection, simple thresholding, Otsu's threshold, or an adaptive threshold) and then find contours using findContours. To obtain the bounding rectangle coordinates, you can use boundingRect, which will give you the coordinates in the form x,y,w,h, and the rectangle can be drawn with rectangle. This gives you the 4 corner points of the contour. If you want to obtain the center point, use moments to extract the centroid of the contour.
Code
import cv2

# Load image, convert to grayscale, and Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Find contours and extract the bounding rectangle coordinates,
# then find moments to obtain the centroid
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

for c in cnts:
    # Obtain bounding box coordinates and draw rectangle
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)

    # Find center coordinate and draw center point
    M = cv2.moments(c)
    cx = int(M['m10']/M['m00'])
    cy = int(M['m01']/M['m00'])
    cv2.circle(image, (cx, cy), 2, (36,255,12), -1)
    print('Center: ({}, {})'.format(cx,cy))

cv2.imshow('image', image)
cv2.waitKey()
This is usually called blob analysis in other machine vision libraries. I haven't used OpenCV for it yet.
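In OpenCV, blob analysis maps roughly onto connected-component analysis (a sketch; the file name is an assumption):
import cv2

img = cv2.imread('object.png', cv2.IMREAD_GRAYSCALE)
binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Each blob gets a label; stats hold the bounding box and area,
# centroids the blob centers
n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
for i in range(1, n):  # label 0 is the background
    x, y, w, h, area = stats[i]
    print('blob %d: bbox=(%d, %d, %d, %d) area=%d center=%s'
          % (i, x, y, w, h, area, centroids[i]))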