Search images from a webcam (Google Goggles style) - opencv

I'd like to find images, from a database I have, using a webcam.
Specifically, I'd like to set up a "price kiosk" where people can walk up with an item, put it in front of the camera, and have it search the database for the price. For several reasons (ease of use being the most important) I don't want to use the barcodes on the products.
The items are relatively easy to scan (they are, for practical purposes, 2D: they are comic books). I have all the covers already scanned. So what I'd like is some way to take the image from the webcam and use it as a source for the search. Of course the image will be distorted (angle, focus, resolution, lighting, rotation, etc). This isn't a problem for Google Goggles (Google Images really), as I've scanned comic book covers in a number of conditions and it's able to find them.
Now, I've been doing some research. I've seen pretty awesome things done with OpenCV, which makes me think this shouldn't be too difficult to implement, especially considering my dataset is much smaller (about 2000 different products) than Google's.
What am I looking for, specifically? Object detection, recognition, features...? I'm confused and I don't even know where to start.

Read up on SIFT (Scale-Invariant Feature Transform).

Related

How to collect and filter images for image recognition?

Newbie question here: right now I need a relatively large database of images, say a few thousand, for image recognition training purposes.
I found some ways to get rough results, like playing with Google Images search terms etc. and bulk-downloading them. But only about 50% of the resulting images are representative of what I need (the rest are either related things or plain garbage). Picking by hand isn't really an option, because it would take a whole lot of time. So is there a quicker way of picking those images? Like using some other image recognition network, or specific software?
You can search Kaggle for image recognition competitions such as cdiscount-image-classification-challenge.
The data is usually available easily.

How can I identify / classify objects in a low-resolution image?

What image recognition technology is good at identifying a low resolution object?
Specifically I want to match a low resolution object to a particular item in my database.
Specific Example:
Given a picture with bottles of wine, I want to identify which wines are in it. Here's an example picture:
I already have a database of high resolution labels to match to.
Given a high-res picture of an individual bottle of wine, it was very easy to match it to its label using Vuforia (an image recognition service). However, the service doesn't work well for lower-resolution matching, like the bottles in the example image.
Research:
I'm new to this area of programming, so apologies for any ambiguities or obvious answers to this question. I've been researching, but there's a huge breadth of technologies out there for image recognition. Evaluating each one takes a significant amount of time, so I'll try to keep this question updated as I research them.
OpenCV: seems to be the most popular open source computer vision library. Many modules, not sure which are applicable yet.
Haar-cascade feature detection: helps with pre-processing an image by orienting a component correctly (e.g. making a wine label vertical)
OCR: good for reading text at decent resolutions - not good for low-resolution labels where a lot of text is not visible
Vuforia: a hosted service that does some types of image recognition. Mostly meant for augmented reality developers. Doesn't offer control over algorithm. Doesn't work for this kind of resolution

How to recognize the color of a poker card using OpenCV

I am currently willing to implement an iOS app that uses OCR to compute poker stats (you put your cards on a table, then take a picture with your iPhone camera and then magic happens). I know that OpenCV for iOS is the way to go but I don't find any code sample to also recognize the color (spade, heart, club, diamond) of the cards. How can I do it?
There are different ways of "understanding" the picture, and each has its own pros and cons. Template matching is not a good idea here, since the cards differ: a very round heart and a sharp, pointy heart are the same suit, but to template matching they would be totally different "hearts". If you are sure that the user is going to input two cards, you should crop the cards and separate them. This can be done with simple color detection (use a Canny edge detector to find the edges). Then search for all the suits and see which one gives the best result. You can also use the BOW (bag of words) approach (google it a little bit): it's about building a visual vocabulary, and from the frequency of visual words you should be able to tell which is which.
Generally nothing can give you a 100% guarantee, but with BOW you can get some interesting results.

Recognising a drawn line using neural networks in a web app

Basically, I was weighing up some options for a software idea I had. The web app thing is a bit of a constraint on the project, so I'm assuming I would be writing this in js.
I need to create a drawable area for the user (which is okay), allow them to draw, and then compare the input to a correct example. This is just an arrow, but the arrow can be double-headed (a normal pointed arrow) or single-headed (half an arrowhead), so the minute details are fairly important, as is the location.
Now, I've read around for a few hours, and it seems a good approach is to downsample the input so I'm only comparing a handful of pixels. I'm wondering, though, if there is a simpler way to achieve what I want here, and whether there are good resources for learning what I feel is a very basic image recognition task. Also, having never implemented something like this, I'm a little worried about the details, like speed; obviously feedback has to be fairly quick.
Thanks.
Use OpenCV. It already covers the kind of use cases you want (location, style, etc. of the image). There are many other open source libraries, but not many are as robust.
After that, you have to decide on all the possible images you want to use as standard images, then collect training examples for each of them (each standard image would be one class).
Now use the pixels as the features (OpenCV will do it for you with minimal help) and run your classification training. Note that you have to provide these training images, with at least a good number of examples for each class. Then use the trained classifier to classify the images your users draw. You can put a GUI on top of it to adapt it to the needs you posted above.

How can I use computer vision to find a shape in an image?

I have a simple photograph that may or may not include a logo image. I'm trying to identify whether a picture includes the logo shape or not. The logo (rectangular shape with a few extra features) could be of various sizes and could have multiple occurrences. I'd like to use Computer Vision techniques to identify the location of these logo occurrences. Can someone point me in the right direction (algorithm, technique?) that can be used to achieve this goal?
I'm quite a novice at Computer Vision, so any direction would be much appreciated.
Thanks!
Practical issues
Since you need a scale-invariant method (that's the proper jargon for "could be of various sizes"), SIFT (as mentioned in Logo recognition in images, thanks overrider!) is a good first choice; it's very popular these days and worth a try. You can find some code to download here. If you cannot use Matlab, you should probably go with OpenCV. Even if you end up discarding SIFT for some reason, trying to make it work will teach you a few important things about object recognition.
General description and lingo
This section is mostly here to introduce you to a few important buzzwords, by describing a broad class of object detection methods, so that you can go and look these things up. Important: there are many other methods that do not fall in this class. We'll call this class "feature-based detection".
So first you go and find features in your image. These are characteristic points of the image (corners and line crossings are good examples) that have a lot of invariances: whatever reasonable processing you do to your image (scaling, rotation, brightness change, adding a bit of noise, etc.) will not change the fact that there is a corner at a certain point. "Pixel value" or "vertical lines" are bad features. Sometimes a feature will include some numbers (e.g. the prominence of a corner) in addition to a position.
Then you do some clean-up, like remove features that are not strong enough.
Then you go to your database. That's something you've built in advance, usually by taking several nice and clean images of whatever you are trying to find, running your feature detection on them, cleaning things up, and arranging them in some data structure for your next stage —
Look-up. You take a bunch of features from your image and try to match them against your database: do they correspond to an object you are looking for? This is pretty non-trivial, since on the face of it you'd have to consider all subsets of the features you've found, which is exponential. So there are all kinds of smart hashing techniques, like the Hough transform and geometric hashing.
Now you should do some verification. You have found some places in the image that are suspect: it's probable that they contain your object. Usually you know the presumed size, orientation, and position of your object, and you can use something simple (like a convolution) to check whether it's really there.
You end up with a bunch of probabilities, basically: for a few locations, how probable it is that your object is there. Here you do some outlier detection. If you expect only 1-2 occurrences of your object, you'll look for the largest probabilities that stand out, and take only these points. If you expect many occurrences (like face detection on a photo of a bunch of people), you'll look for very low probabilities and discard them.
That's it, you are done!
