My objective -
Input: A PNG floorplan (with many electrical equipment symbols on it), and a user who selects one of those symbols using a bounding box.
Output: The same PNG floorplan but with all matching symbols highlighted
I have been looking into feature detection as a way to find matching symbols, but I can't find any examples online of it being used on 2D objects; I only ever see it used on photos or live video. Does feature detection work for 2D objects as well? If not, why not?
For those interested, I have been developing in C#, using an OpenCV wrapper API called Emgu CV (it has all the OpenCV functions and some more).
You can take a look at research on logo recognition. You can use a classical feature detector such as SIFT or SURF and then compute shape invariants from the extracted features, such as the orientation of triangles formed by the feature points.
Here is a classic paper to look at for some ideas:
Scalable Logo Recognition
I guess your input is a binary / grayscale image with mainly lines, arrows, circles, ...
Local feature detection (e.g. ORB, SURF, SIFT, ...) is best suited to "high entropy" images, by which I mean photos of scenes with a lot of texture.
Here you have geometrical shapes, so I think a geometric shape detection algorithm, such as the Generalized Hough Transform, would be a better choice.
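To make the voting idea behind the Generalized Hough Transform concrete, here is a minimal pure-Python sketch. It is a simplification and not the OpenCV or Emgu CV implementation: it handles translation only (no rotation or scale), ignores gradient orientation, and works on small binary grids. The function names and the min_fraction threshold are made up for illustration.

```python
from collections import Counter


def edge_points(grid):
    """Return (row, col) of every non-zero cell in a binary grid."""
    return [(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v]


def build_r_table(template):
    """Offsets from each template edge point to the template's rounded centroid."""
    pts = edge_points(template)
    ref_r = round(sum(r for r, _ in pts) / len(pts))
    ref_c = round(sum(c for _, c in pts) / len(pts))
    return [(ref_r - r, ref_c - c) for r, c in pts]


def hough_votes(image, r_table):
    """Every image edge point votes for each reference position it could belong to."""
    votes = Counter()
    height, width = len(image), len(image[0])
    for r, c in edge_points(image):
        for dr, dc in r_table:
            rr, cc = r + dr, c + dc
            if 0 <= rr < height and 0 <= cc < width:
                votes[(rr, cc)] += 1
    return votes


def find_matches(image, template, min_fraction=0.9):
    """Reference positions that collected at least min_fraction of the possible votes."""
    r_table = build_r_table(template)
    needed = min_fraction * len(r_table)
    votes = hough_votes(image, r_table)
    return sorted(pos for pos, n in votes.items() if n >= needed)


if __name__ == "__main__":
    symbol = [[1, 1, 1],
              [1, 0, 1],
              [1, 1, 1]]
    plan = [[0] * 12 for _ in range(8)]
    for top, left in [(1, 1), (4, 7)]:          # paste the symbol twice
        for r in range(3):
            for c in range(3):
                if symbol[r][c]:
                    plan[top + r][left + c] = 1
    print(find_matches(plan, symbol))           # centres of both pasted symbols
```

In the floorplan case, the template would be the edge map of the region inside the user's bounding box, and each returned position is a candidate location to highlight.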
Related
I am working on a hand detection project. There are many good projects on the web that do this, but what I need is detection of a specific hand pose: a fully open palm, with the whole palm facing outwards, like the image below:
The first hand faces inwards, so it should not be detected; the right one faces outwards, so it should be detected. I can already detect a hand with OpenCV, but how can I tell the hand's orientation?
Matching against the forehand is a texture classification problem, which is a classic pattern recognition problem. I suggest you try one of the following methods:
Gabor filters: they are good at capturing orientation and pixel intensities (the forehand has different features). OpenCV has the getGaborKernel function; its most important parameters are theta (orientation) and lambd (the wavelength, which sets the frequency). To keep it simple, you can apply this to a cropped zone of the palm (as you have already detected it, it should be easy to crop, for example, the thumb, or a rectangular zone around the centre of gravity, etc.). Then you can compare the filter responses with a small database of images of the same zone to get a matching rate, or you can use an SVM classifier, which you have to train on a set of images by constructing the training matrix the SVM needs (check this question and this paper)
Local Binary Patterns (LBP): an important feature descriptor used for texture matching. You can apply it to the whole palm image, or to a cropped zone or a finger. It is easy to use in OpenCV, and a lot of tutorials with code are available for this method. I recommend reading this paper on Invariant Texture Classification with Local Binary Patterns. Here is a good tutorial
Haralick texture: I've read that it works well when a set of features quantifies the entire image (global feature descriptors). It is not implemented in OpenCV but is easy to implement; check this useful tutorial
Training models: I've already suggested an SVM classifier coupled with some descriptor, which can work well.
OpenCV has an interesting FaceRecognizer class for face recognition. It could be an interesting idea to use it with palm images in place of the face images (resize and rotate them to get a single palm pose). This class offers three methods; one of them is Local Binary Patterns Histograms, which is recommended for texture recognition. And why not try the other models too (Eigenfaces and Fisherfaces); check this tutorial
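To show what the Local Binary Patterns descriptor mentioned above actually computes, here is a minimal pure-Python sketch of the basic 3x3 variant plus a chi-square histogram comparison. It is only an illustration under simple assumptions (grayscale image as a list of rows, border pixels skipped, no uniform-pattern or rotation-invariant extension), and the function names are made up for this example.

```python
# Clockwise 8-neighbourhood, starting at the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """8-bit code: bit k is set when neighbour k is >= the centre pixel."""
    centre = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist]


def chi_square(h1, h2):
    """Chi-square distance between two histograms (0.0 means identical)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)
```

One possible use: compute lbp_histogram for a cropped palm region and for a few reference crops of palms and backs of hands, then pick the reference with the smallest chi_square distance, or feed the histograms to the SVM suggested above.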
Well, if you go the MacGyver way, you can notice that the left hand has bones sticking out in a certain direction, while the right hand has all the finger lines and a few lines in the palm.
These lines are always more or less the same, so you could try to detect them with OpenCV edge detection or Hough lines. Because the lines are dark, you might even be able to threshold them out. Then gather information from those lines, such as angles and regressions, see which features you can collect, and train a simple decision tree.
That assumes you do not have enough data. If you do, go for deep learning: take a basic InceptionV3 model and retrain the last dense layer to classify between two classes with a softmax, or to predict the probability of the hand being up/down with a sigmoid. Check this link; TensorFlow has your back on training this one, with ready-to-run code.
Questions? Ask away
Take a look at what Leap Motion has done with the Oculus Rift. I'm not sure what they're using internally to segment hand poses, but there is another paper that produces hand poses effectively. If you have a stereo camera setup, you can use this paper's methods: https://arxiv.org/pdf/1610.07214.pdf.
The only promising solutions I've seen for mono camera train on large datasets.
Use a Haar cascade classifier.
You can get the classifier model file and then use it here.
Just search Google for 'Haarcascade detection of Palm' or use the code below.
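import cv2

# Open the default webcam and load the palm cascade (path as given in this answer).
cam = cv2.VideoCapture(0)
ccfr2 = cv2.CascadeClassifier('haar-cascade-files-master/palm.xml')

while True:
    retval, image = cam.read()
    if not retval:  # stop if no frame could be read
        break
    grey = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    palm = ccfr2.detectMultiScale(grey, scaleFactor=1.05, minNeighbors=3)
    # Draw a white box around every detected palm (channel values go up to 255).
    for x, y, w, h in palm:
        image = cv2.rectangle(image, (x, y), (x + w, y + h), (255, 255, 255), 2)
    cv2.imshow("Window", image)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
        break

# Free the camera and close the preview window.
cam.release()
cv2.destroyAllWindows()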
Best of Luck for your experience using HaarCascade.
I'm trying to detect shapes written on a whiteboard with a black/blue/red/green marker. The shapes can be circles, rectangles or triangles. The image can be found at the bottom of this post.
I'm using OpenCV as the framework for the image recognition.
My first task is to research and list the different strategies that could be used for the detection. So far I have found the following:
1) Grayscale, Blur, Canny Edge, Contour detection, and then some logic to determine if the contours detected are shapes?
2) Haar training with different features for shapes
3) SVM classification
4) Grayscale, Blur, Canny Edge, Hough transformation and some sort of color segmentation?
Are there any other strategies that I have missed? Any newer articles or tested approaches? How would you do it?
One of the test pictures: https://drive.google.com/file/d/0B6Fm7aj1SzBlZWJFZm04czlmWWc/view?usp=sharing
UPDATE:
The first strategy seems to work best, but is far from perfect. Issues arise when boxes are not closed, or when the whiteboard has a lot of noise. Haar training does not seem very effective, because the shapes to detect are simple and have few distinctive features. I have not tried a CNN yet, but it seems better suited to image classification than to detecting shapes in a larger image (though I'm not sure).
I think that the first option should work. You can use Fourier descriptors to classify the segmented shapes.
http://www.isy.liu.se/cvl/edu/TSBB08/lectures/DBgrkX1.pdf
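As a rough illustration of the Fourier descriptor idea, here is a pure-Python sketch. It treats the contour points as complex numbers, takes a plain DFT, drops the DC term (translation), keeps only magnitudes (rotation and starting point) and divides by the first harmonic (scale). It assumes the contour is closed, sampled roughly uniformly, and that all contours being compared are traversed in the same direction; the function name and the count parameter are made up for this example.

```python
import cmath


def fourier_descriptors(contour, count=6):
    """Magnitudes of DFT harmonics 1..count of a closed contour, divided by harmonic 1."""
    n = len(contour)
    if count >= n:
        raise ValueError("need more contour points than descriptors")
    z = [complex(x, y) for x, y in contour]
    mags = []
    for k in range(1, count + 1):            # k = 0 (the centroid) is skipped
        coeff = sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        mags.append(abs(coeff))
    if mags[0] == 0:
        raise ValueError("first harmonic is zero; try reversing the contour direction")
    return [m / mags[0] for m in mags]


def descriptor_distance(d1, d2):
    """Euclidean distance between two descriptor vectors."""
    return sum((a - b) ** 2 for a, b in zip(d1, d2)) ** 0.5
```

A segmented shape could then be labelled by comparing its descriptor vector against stored descriptors for a reference circle, rectangle and triangle and taking the nearest one.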
Also, maybe you can find something useful here:
http://www.pyimagesearch.com/2016/02/08/opencv-shape-detection/
If you want to try a more challenging but modern approach, consider deep learning (I would start with a CNN). There are many implementations available on the internet. Although it is probably overkill for this specific project, it might help you in the future.
Context:
I have the RGB-D video from a Kinect, which is aimed straight down at a table. There is a library of around 12 objects I need to identify, alone or several at a time. I have been working with SURF extraction and detection from the RGB image, preprocessing by downscaling to 320x240, grayscale, stretching the contrast and balancing the histogram before applying SURF. I built a lasso tool to choose among detected keypoints in a still of the video image. Then those keypoints are used to build object descriptors which are used to identify objects in the live video feed.
Problem:
SURF examples show successful identification of objects with a decent amount of text-like feature detail, e.g. logos and patterns. The objects I need to identify are relatively plain but have distinctive geometry. The SURF features found in my stills are sometimes consistent but mostly unimportant surface features. For instance, say I have a wooden cube. SURF detects a few bits of grain on one face, then fails on other faces. I need to detect (something like) that there are four corners at equal distances and right angles. None of my objects has much of a pattern but all have distinctive symmetric geometry and color. Think cellphone, lollipop, knife, bowling pin. My thought was that I could build object descriptors for each significantly different-looking orientation of the object, e.g. two descriptors for a bowling pin: one standing up and one lying down. For a cellphone, one lying on its front and one on its back. My recognizer needs rotational invariance and some degree of scale invariance in case objects are stacked. Ability to deal with some occlusion is preferable (SURF behaves well enough) but not the most important characteristic. Skew invariance would be preferable and SURF does well with paper printouts of my objects held by hand at a skew.
Questions:
Am I using the wrong SURF parameters to find features at the wrong scale? Is there a better algorithm for this kind of object identification? Is there something as readily usable as SURF that uses the depth data from the Kinect along with or instead of the RGB data?
I was doing something similar for a project, and ended up using a super simple method for object recognition, which was using OpenCV blob detection, and recognizing objects based on their areas. Obviously, there needs to be enough variance for this method to work.
You can see my results here: http://portfolio.jackkalish.com/Secondhand-Stories
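The area-based idea above can be sketched in a few lines of pure Python. This is only an illustration, not the code from that project: it labels 4-connected blobs in a small binary grid with a flood fill and matches each blob's pixel count against a table of known areas. The object names, reference areas and the 20% tolerance are made-up assumptions.

```python
def blob_areas(grid):
    """Pixel count of every 4-connected blob of non-zero cells, in scan order."""
    height, width = len(grid), len(grid[0])
    seen = [[False] * width for _ in range(height)]
    areas = []
    for r in range(height):
        for c in range(width):
            if not grid[r][c] or seen[r][c]:
                continue
            stack, area = [(r, c)], 0
            seen[r][c] = True
            while stack:                       # iterative flood fill
                y, x = stack.pop()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and grid[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            areas.append(area)
    return areas


def classify_by_area(area, known_areas, tolerance=0.2):
    """Name of the known object with the closest area, or None if none is close enough."""
    name, ref = min(known_areas.items(), key=lambda kv: abs(kv[1] - area))
    return name if abs(ref - area) <= tolerance * ref else None
```

With a top-down camera at a fixed height the areas stay fairly stable, but stacked objects sit closer to the camera and look larger, so the tolerance would need tuning (or the Kinect depth could be used to rescale the area).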
I know there are other methods out there, one possible solution for you could be approxPolyDP, which is described here:
How to detect simple geometric shapes using OpenCV
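For intuition about what approxPolyDP does, here is a pure-Python sketch of the Ramer-Douglas-Peucker simplification that the OpenCV documentation says it is based on, followed by a vertex count to guess the shape. This is not OpenCV's code: the closed-contour handling (splitting at the point farthest from the first point, then dropping collinear leftovers) is my own simplification, and the epsilon default and label strings are made up for the example.

```python
import math


def _line_distance(p, a, b):
    """Distance from point p to the infinite line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    if dx == 0 and dy == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dy * (p[0] - a[0]) - dx * (p[1] - a[1])) / math.hypot(dx, dy)


def rdp(points, epsilon):
    """Simplify an open polyline, always keeping both end points."""
    if len(points) < 3:
        return list(points)
    dists = [_line_distance(p, points[0], points[-1]) for p in points[1:-1]]
    i = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[i - 1] <= epsilon:
        return [points[0], points[-1]]
    return rdp(points[:i + 1], epsilon)[:-1] + rdp(points[i:], epsilon)


def approx_closed(contour, epsilon):
    """Simplify a closed contour given as an ordered list of (x, y) points."""
    start = contour[0]
    far = max(range(len(contour)),
              key=lambda i: math.hypot(contour[i][0] - start[0], contour[i][1] - start[1]))
    merged = rdp(contour[:far + 1], epsilon)[:-1] + rdp(contour[far:] + [start], epsilon)[:-1]
    n = len(merged)
    # Drop vertices that lie on the line between their neighbours (e.g. a mid-edge start).
    return [p for i, p in enumerate(merged)
            if _line_distance(p, merged[i - 1], merged[(i + 1) % n]) > epsilon]


def classify_polygon(contour, epsilon=0.5):
    """Very rough label based only on the number of remaining vertices."""
    corners = len(approx_closed(contour, epsilon))
    return {3: "triangle", 4: "rectangle"}.get(corners, "circle or other")
```

In practice epsilon is often chosen as a small fraction of the contour perimeter rather than a fixed value, and a vertex count alone cannot tell a square from any other quadrilateral, so angle or side-length checks would be a natural next step.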
Would love to hear about your progress on this!
For a project I have to detect a pattern and track it in space despite rotation, noise, etc.
It's highlighted with IR light and recorded with an IR camera:
Picture: https://i.stack.imgur.com/RJuVS.png
As in this picture, it will only ever be a very simple shape, and we can choose which one we use.
I need some direction on how to recognize these shapes, please.
What I currently do is thresholding and erosion to get a cleaner shape, then contour detection and polygon approximation.
What should I do next? I tried Hu moments but they did not work well at all.
Could you please give me an overall approach to recognizing and tracking such a pattern in space?
Can you choose which shape to project?
If so, I would recommend using a few concentric circles. Then, using the Hough transform for circles, you can easily find the center of the shape even when tracking is extremely hard (large movement/low frame rate).
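Here is a small pure-Python sketch of why concentric circles work well with a circle Hough transform: every edge pixel of every ring votes for the same centre, so the peak stands out clearly. It assumes the ring radii are known in pixels and uses a brute-force accumulator; it is an illustration only and not how cv2.HoughCircles is implemented (that one works from image gradients). The function names and the 0.75 pixel tolerance are made up.

```python
import math
from collections import Counter


def circle_points(cx, cy, radius, samples=72):
    """Rasterise a circle outline into a set of integer pixel coordinates."""
    return {(round(cx + radius * math.cos(math.tau * k / samples)),
             round(cy + radius * math.sin(math.tau * k / samples)))
            for k in range(samples)}


def vote_centres(edge_pixels, radii, tolerance=0.75):
    """Each edge pixel votes once for every cell lying about one known radius away."""
    votes = Counter()
    reach = max(radii) + 1
    for x, y in edge_pixels:
        for cx in range(x - reach, x + reach + 1):
            for cy in range(y - reach, y + reach + 1):
                d = math.hypot(x - cx, y - cy)
                if any(abs(d - r) <= tolerance for r in radii):
                    votes[(cx, cy)] += 1
    return votes


def find_centre(edge_pixels, radii):
    """Cell with the most votes, i.e. the most likely common centre."""
    votes = vote_centres(edge_pixels, radii)
    return max(votes, key=votes.get)


if __name__ == "__main__":
    rings = circle_points(10, 10, 3) | circle_points(10, 10, 6)
    print(find_centre(rings, [3, 6]))
```

Because the votes from all rings add up at one cell, the centre should stay recoverable even if part of the pattern is missing or blurred in a given frame.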
If you must use a rectangular shape, there is a good open-source project that does that. It is part of a project to read street signs and auto-translate them.
Here is a link: http://code.google.com/p/signfinder/
The source is not large, and it would be easy to cut out the relevant part.
It uses OpenCV's "good features to track" in its CornerFinder module.
Hope it helped
It is possible. You need the following steps: threshold the image, apply some morphological enhancement, extract the blobs and normalize their size, analyse the blob shapes, and compare the analysis results with the pattern that you want to track.
There are many methods for blob shape analysis. Simple methods: geometric dimensions, area, perimeter, circularity measurement, bit quads and others (see, for example, William K. Pratt, "Digital Image Processing", chapter 18). Complex methods: spatial moments, template matching, neural networks and others.
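As an example of the simple measurements listed above, here is a pure-Python sketch that computes area, perimeter and circularity from a polygon outline (for instance the output of a polygon approximation). Circularity is taken as 4*pi*area/perimeter^2, which is 1 for a perfect circle and lower for less compact shapes. The function names are made up for this example.

```python
import math


def polygon_area(vertices):
    """Shoelace formula; vertices are (x, y) tuples in order around the outline."""
    n = len(vertices)
    twice = sum(vertices[i][0] * vertices[(i + 1) % n][1]
                - vertices[(i + 1) % n][0] * vertices[i][1] for i in range(n))
    return abs(twice) / 2.0


def polygon_perimeter(vertices):
    """Sum of the edge lengths, including the closing edge."""
    n = len(vertices)
    return sum(math.hypot(vertices[(i + 1) % n][0] - vertices[i][0],
                          vertices[(i + 1) % n][1] - vertices[i][1]) for i in range(n))


def circularity(vertices):
    """4*pi*area / perimeter**2: 1.0 for a circle, pi/4 for a square."""
    perimeter = polygon_perimeter(vertices)
    return 4.0 * math.pi * polygon_area(vertices) / (perimeter * perimeter)
```

Circularity does not change with rotation or scale, so it can separate, say, a disc from a square or a triangle without any size normalization, although on real blobs noise along the outline inflates the perimeter and lowers the value.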
In any event, it is very hard to answer exactly without knowing the pattern shapes that you want to track.
hope it helped
How can I detect irises in a face with opencv?
Have a look at this forum thread. There's some source code there to get you started, but be careful about using it directly -- the original author seemed to have problems compiling it.
Start by detecting circles (see cvHoughCircles). Hint: eyes have a series of concentric circles.
OpenCV has a face detection module which uses Haar cascades. You can use the same method to detect irises: collect some iris images as the positive set and non-iris images as the negative set, then use the Haar training module to train it.
A quick and dirty approach would be to do eye detection first with a Haar cascade; there are good model XML files shipped with OpenCV 2.4.2. Then do some skin detection (in HSV space rather than RGB space) to identify the area of the eye in the middle, or do a circle search.
Also, projections and histogram-based decisions can be used once the eye area is cropped.
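A minimal sketch of the projection idea, assuming the eye region has already been cropped to a small grayscale image (a list of rows) and that the iris/pupil is the darkest part of the crop. The function names are made up, and on real images eyelashes, eyebrows or shadows can easily mislead such a simple rule.

```python
def projections(gray):
    """Row sums and column sums of a grayscale image given as a list of rows."""
    rows = [sum(row) for row in gray]
    cols = [sum(col) for col in zip(*gray)]
    return rows, cols


def darkest_point(gray):
    """(row, col) where the row projection and the column projection are both lowest."""
    rows, cols = projections(gray)
    return rows.index(min(rows)), cols.index(min(cols))
```

The resulting point could serve as a starting guess for a circle search restricted to a small neighbourhood.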