Image processing: Recognise multiple instances of the same objects in an image - image-processing

I am working on a project where I have to recognize objects on grocery shelves. You can see a sample image below:
I need to find which products exist in an image. An example result image is shown below:
OpenCV tools like SURF, SIFT, and ORB detect only one occurrence of an object in an image. Can you suggest some papers or tools to solve this problem?

There are several techniques for detecting multiple instances of the same object in an image.
The most primitive is template matching: you create a database of training images at multiple scales and rotations so that objects can be detected under varying conditions. Many techniques outperform this legacy approach, though.
Other techniques use texture features that are invariant to scale, rotation, or both; for example GLCM, LBP, HOG, SIFT, ORB, and others.
Your statement that "OpenCV tools like SURF, SIFT, ORB detect only one occurrence of the object in an image" needs refinement.
The listed tools are not object detectors in themselves; they are means of extracting local texture features.
It is up to you to build multiple-object detection on top of them.
There is a more tailored way to solve your problem: it seems that all of the objects you need to detect contain the text TASSAY.
You can extract that text using a group of morphological operations and then locate it with a blob detector.
Once the text is found, its position is easy to measure.
The object's bounding box can then be inferred from the text location.
Hope it helps.

Related

How to detect and recognize a specific object in an image

I want to detect or recognize a specific object in an image. First, let me say what I have done. I tried to detect a logo, e.g. the Google logo. I have the original image of the logo, but the images I am going to process were taken with different cameras, from different angles and distances, and from different screens (wide screens, like a cinema).
I am using OpenCV 3 to check whether this logo is in these images. I have tried the OpenCV SURF and SIFT functions, the NORM_L2 comparison, template matching, and an SVM (it was very slow and detection was not correct), plus some other OpenCV functions, but none was good enough. Then I wrote my own algorithm, which works better than the above functions but still cannot satisfy the requirements.
Now my question is: is there a better way to detect a specific object in an image? For example, what should I do in the first and second... steps?

Wikitude/AR SDK's to Pick out objects in 3d space

I'm looking at integrating an AR kit into our iOS app so we can use the camera to scan a room or field of view for objects. Above is an example of what I mean: if you were to bring up the camera, it would highlight the separate objects in the room and allow them to be clicked and "added" into the system.
Does anyone know if this is achievable with the current AR kits or anything else out there? They all seem to require that the objects you are looking for be pre-defined and loaded into a database so the app can find them. I'm hoping it can pick out the objects in real time. It doesn't need to know any details about the actual object, just that it can be separated from the background scenery.
Any ideas?
The OpenCV library (available on iOS) contains many algorithms for comparing image blobs. If you want to match a simple template to find objects, try the Viola-Jones algorithm and so-called Haar cascades. OpenCV ships a trained collection of templates in XML files, for example for detecting faces, and it includes a training utility so you can generate cascades for other kinds of objects.
Some example projects:
https://github.com/alexmac/alcexamples/blob/master/OpenCV-2.4.2/doc/user_guide/ug_traincascade.rst Cascade Classifier Training
https://github.com/lukagabric/iOS-OpenCV Example code for detecting Colors and Circle shapes
https://github.com/BloodAxe/OpenCV-Tutorial Feature Detection (SURF, ORB, FREAK)
https://github.com/foundry/OpenCVSquaresSL Square Detection using pyramid scaling, Canny, contours, contour simplification

Object Recognition by Outlines vs Features

Context:
I have the RGB-D video from a Kinect, which is aimed straight down at a table. There is a library of around 12 objects I need to identify, alone or several at a time. I have been working with SURF extraction and detection from the RGB image, preprocessing by downscaling to 320x240, grayscale, stretching the contrast and balancing the histogram before applying SURF. I built a lasso tool to choose among detected keypoints in a still of the video image. Then those keypoints are used to build object descriptors which are used to identify objects in the live video feed.
Problem:
SURF examples show successful identification of objects with a decent amount of text-like feature detail, e.g. logos and patterns. The objects I need to identify are relatively plain but have distinctive geometry. The SURF features found in my stills are sometimes consistent but mostly unimportant surface features. For instance, say I have a wooden cube: SURF detects a few bits of grain on one face, then fails on the other faces. I need to detect (something like) the fact that there are four corners at equal distances and right angles. None of my objects has much of a pattern, but all have distinctive symmetric geometry and color. Think cellphone, lollipop, knife, bowling pin.
My thought was that I could build object descriptors for each significantly different-looking orientation of an object, e.g. two descriptors for a bowling pin: one standing up and one lying down; for a cellphone, one lying on the front and one on the back.
My recognizer needs rotational invariance and some degree of scale invariance in case objects are stacked. The ability to deal with some occlusion is preferable (SURF behaves well enough) but not the most important characteristic. Skew invariance would also be preferable, and SURF does well with paper printouts of my objects held by hand at a skew.
Questions:
Am I using the wrong SURF parameters to find features at the wrong scale? Is there a better algorithm for this kind of object identification? Is there something as readily usable as SURF that uses the depth data from the Kinect along with or instead of the RGB data?
I was doing something similar for a project and ended up using a very simple method for object recognition: OpenCV blob detection, recognizing objects based on their areas. Obviously, there needs to be enough variance between the object areas for this method to work.
You can see my results here: http://portfolio.jackkalish.com/Secondhand-Stories
I know there are other methods out there; one possible solution for you could be approxPolyDP, which is described here:
How to detect simple geometric shapes using OpenCV
Would love to hear about your progress on this!

OpenCV feature detection for recognition of multiple different images

My question is - can I recognize different templates in a source image using feature detection in OpenCV? Let's say my templates are road signs.
I am using ORB, but this is not tracker-specific question.
My basic approach without feature detection is:
Image preparation (filtering etc);
Detecting ROI where my object may be located;
Resizing the ROI to the templates' size and comparing it with each template I have (i.e. template matching);
Maximum correlation after comparison is an object I look for.
But with feature detection, I detect keypoints and descriptors for each image in my template set and for the ROI where the object might be located, and the matcher returns distances for all the descriptors in my ROI.
I can't turn this into any correlation measure between the ROI and the templates; in other words, I can't decide whether the ROI image and a template image show the same object based on the information the matcher provides.
So, to be more specific: is my approach wrong, and are feature detectors meant to detect one template object in a source image (which is not what I need), or am I just not grasping the basic concepts of feature detection and in need of help?
You may be missing two things. First, remove outliers in your feature matching with a method like RANSAC over a homography. Second, project the corners of your template into the scene to draw the "rectangle" around the detected object. You should also define a threshold on the minimum number of inliers required for a valid detection.
Check this tutorial on finding objects with feature detection.
I will refer you to a book called:
'OpenCV 2 Computer Vision Application Programming Cookbook'
Just browse the relevant chapters.

object recognition performance not good

I am trying to do object recognition using algorithms such as SURF, FERN, and FREAK in OpenCV 2.4.2.
I am using the programs from the OpenCV samples without modification: find_obj.cpp, find_obj_ferns.cpp, freak_demo.cpp.
I tried changing the parameters of the algorithms, which didn't help.
I have my training images, test images, and the result of FREAK recognition here.
As you can see, the result is pretty bad.
No feature descriptors are detected for one of the training images (image here).
Feature descriptors are detected outside the object boundary for the other (image here).
I have a few questions:
Why do these algorithms work only with grayscale images? It is apparent that for my training images above, the object could be detected easily if RGB were taken into account. Is there any technique that does this?
Is there any other way to improve performance? I tried fiddling with the feature parameters, which didn't work well.
The first thing I observed in your image is that the object is plain, with no texture differences. All the feature detectors you used look for corners, which are view-invariant: keypoints whose neighborhood is unique and whose x and y derivatives have good magnitude. I have uploaded my analysis; see the figures.
How can you verify what I am saying?
Just look at the descriptor values of a keypoint found on your object: you will see that most of them are zeros, because a descriptor describes the variation of the edges around a corner point in a specific direction (see the SURF documentation for more details).
The object you are trying to detect looks like a mobile phone, so just turn the object over and repeat the experiment; you will surely get better results, because the front side of such objects generally has more texture: switches, logos, and so on.
Here is a result I uploaded.