I am trying to develop an Android application that identifies paper currency from captured images. I have tried the template matching method, but it is not scale invariant and doesn't give an accurate match. I am thinking of using a histogram-based method instead; will I get better results?
Also, how can I classify currencies of different colors based on the Hue channel?
This seems like a case where recognition based on SIFT or SURF features can give you good results.
Extract SURF features from your training images and build a FlannBasedMatcher (or another matcher). Then extract SURF features from the input image and use the matcher to compute distances between the input features and those of your training images. Keep the correspondences with the lowest descriptor distances and check whether you have enough of them. If your input image has a lot of background, you can also compute a homography from those correspondences to verify that your guess is correct.
There is an example in the OpenCV doc to do something very similar to this.
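For illustration, here is a minimal OpenCV (Python) sketch of that pipeline, assuming one training image per banknote. SURF lives in the contrib/nonfree module, so you may need to substitute a free detector such as ORB; the file names, the ratio-test value and the match threshold are all assumptions to be tuned.

```python
import cv2
import numpy as np

# Assumes opencv-contrib-python built with the nonfree modules; otherwise
# swap SURF for a free detector such as ORB (with a Hamming-based matcher).
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)

train_img = cv2.imread("banknote_template.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file
query_img = cv2.imread("captured_photo.jpg", cv2.IMREAD_GRAYSCALE)     # hypothetical file

kp_train, des_train = surf.detectAndCompute(train_img, None)
kp_query, des_query = surf.detectAndCompute(query_img, None)

# FLANN with a KD-tree index works for float descriptors such as SURF/SIFT.
flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=50))
matches = flann.knnMatch(des_train, des_query, k=2)

# Ratio test: keep matches whose best distance is clearly lower than the
# second best, i.e. the "lowest descriptor distance" selection above.
good = [m for m, n in matches if m.distance < 0.7 * n.distance]
print("good matches:", len(good))

# With enough correspondences, a homography can confirm the guess even
# when the photo contains a lot of background.
if len(good) > 10:
    src = np.float32([kp_train[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_query[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
```

In practice you would run this against one training image per denomination and pick the denomination with the most good matches.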
I have started my first project in the field of image recognition using feature point detectors and descriptors. I had no prior knowledge of image recognition techniques before starting this project, so I researched the available detectors and descriptors and learned about the differences between them. Finally, I opted to work with the ORB detector and descriptor for image recognition (if it doesn't work according to my requirements, I would like to try BRISK later).
As of now I am at the stage of getting results for image recognition using ORB. At this point, I was thinking of using Gaussian filters in my code so that I can get better results even when the input image is a bit blurred.
My questions:
1) Is it possible to use Gaussian filters with ORB to get much better results for Image recognition?
2) When I read the ORB paper, I came across the lines below:
FAST does not produce a measure of cornerness, and we have found that it has large responses along edges. We employ a Harris corner measure [11] to order the FAST keypoints. For a target number N of keypoints, we first set the threshold low enough to get more than N keypoints, then order them according to the Harris measure, and pick the top N points.

FAST does not produce multi-scale features. We employ a scale pyramid of the image, and produce FAST features (filtered by Harris) at each level in the pyramid.
So ORB uses the Harris corner measure to rank the corners detected in an image. Given that, is it worth it for me to use Gaussian filters along with ORB?
3) Does ORB use only the Harris corner measure to select corners, or other measures as well?
Please enlighten me on the questions above. Below is a rough sketch of the preprocessing I have in mind.
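This is only what I am considering, not working code; the file name and the 3x3 kernel size are placeholders I would still need to tune against my actual images.

```python
import cv2

img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder input image

# Light Gaussian smoothing before detection; the 3x3 kernel is an assumption
# and would need tuning against the actual blur level of the captures.
smoothed = cv2.GaussianBlur(img, (3, 3), 0)

orb = cv2.ORB_create(nfeatures=500)
keypoints, descriptors = orb.detectAndCompute(smoothed, None)
print(len(keypoints), "keypoints detected")
```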
I often get confused about the meaning of the term descriptor in the context of image features. Is a descriptor the description of the local neighborhood of a point (e.g. a float vector), or is it the algorithm that outputs the description? Also, what exactly is the output of a feature extractor then?
I have been asking myself this question for a long time, and the only explanation I have come up with is that a descriptor is both: the algorithm and the description. A feature detector is used to detect distinctive points. A feature extractor, however, does not then seem to make any sense.
So, is a feature descriptor the description or the algorithm that produces the description?
A feature detector is an algorithm which takes an image and outputs locations (i.e. pixel coordinates) of significant areas in your image. An example of this is a corner detector, which outputs the locations of corners in your image but does not tell you any other information about the features detected.
A feature descriptor is an algorithm which takes an image and outputs feature descriptors/feature vectors. Feature descriptors encode interesting information into a series of numbers and act as a sort of numerical "fingerprint" that can be used to differentiate one feature from another. Ideally this information would be invariant under image transformation, so we can find the feature again even if the image is transformed in some way. An example would be SIFT, which encodes information about the local neighbourhood image gradients into the numbers of the feature vector. Other examples you can read about are HOG and SURF.
EDIT: When it comes to feature detectors, the "location" might also include a number describing the size or scale of the feature. This is because things that look like corners when "zoomed in" may not look like corners when "zoomed out", and so specifying scale information is important. So instead of just using an (x,y) pair as a location in "image space", you might have a triple (x,y,scale) as location in "scale space".
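To make the distinction concrete, here is a small OpenCV (Python) sketch, purely as an illustration: FAST plays the role of the detector and ORB's descriptor plays the role of the descriptor (the file name is a placeholder).

```python
import cv2

img = cv2.imread("example.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder input

# Detector: FAST only outputs keypoint locations (and sizes), no description.
fast = cv2.FastFeatureDetector_create()
keypoints = fast.detect(img, None)

# Descriptor: ORB's compute() turns each keypoint's neighbourhood into a
# numeric vector, the "fingerprint" used to tell features apart.
orb = cv2.ORB_create()
keypoints, descriptors = orb.compute(img, keypoints)

print(descriptors.shape)  # one 32-byte row per described keypoint
```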
I understand the descriptor as the description of the neighborhood of a point in the image. In other words, it is a vector computed from the image that describes the visual features of the content around that point.
For example, there is a method in HOG (Histogram of Oriented Gradients) based on image gradients and spatial/orientation binning. The extractHOGFeatures function in Matlab and "Classification using HOG" have visual examples for better understanding.
I have a lot of images of paper cards in different shades of colors, like all blues or all reds. In the images, the cards are held up to different objects of that color.
I want to write a program that compares the object's color to the shades on the card and chooses the closest shade.
However, I realize that for future images my camera is going to be subject to lots of different lighting, so I think I should convert into HSV space.
I'm also unsure of what type of distance measure I should use. Given some sort of blobs from the cards, I could average over the HSV and simply see which blob's average is the closest.
But I welcome any and all suggestions, I want to learn more about what I can do with OpenCV.
EDIT: A sample
Here I want to compare the filled-in red of the 6th dot to see if it is actually the shade of the 3rd paper rectangle.
I think one possibility is to do the following:
Color histograms from Hue and Saturation channels
Compute the color histogram of the filled circle.
Compute the color histogram of the bar of paper.
Compute a distance between the two using a histogram distance measure.
Possibilities here include:
Chi-square distance,
Earth mover's distance,
Bhattacharyya distance,
Histogram intersection, etc.
Check this opencv link for details on computing histograms
Check this opencv link for details on the histogram comparisons
Note that when computing the color histograms, convert your images to HSV colorspace as you yourself suggested. Then, there are two things to note here.
[EDITED to make this a suggestion rather than a must do, because I believe the V channel might be necessary to differentiate the shades. Anyhow, try both and go with the one giving the better result. Apologies if this sent you off track.] One possibility is to only use the Hue and Saturation channels, i.e. you build a 2D histogram rather than a 3D one, consisting of values from the hue and saturation channels. The reason for doing so is that the variation in lighting is most felt in the V channel. This, together with the use of histograms, should hopefully make your comparisons more robust to lighting changes. There is some discussion on ignoring the V channel when building color histograms in this post here. You might find the references therein useful.
Normalize the histograms using the OpenCV functions. This is to account for the different sizes of the patches of material (your small circle and the huge color bar have different numbers of pixels).
You might also wish to consider performing some form of preprocessing to "stretch" the color in the image e.g. using histogram equalization or an "S curve" mapping so that the different shades of color get better separated. Then compute the color histograms on this processed image. Keep the information for the mapping and perform it on new test samples before computing their color histograms.
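As a rough sketch of the histogram computation and comparison described above in OpenCV (Python); the bin counts, file names and the particular comparison methods are assumptions to be tuned.

```python
import cv2

def hs_histogram(bgr_patch, h_bins=30, s_bins=32):
    """Normalized 2D Hue-Saturation histogram (V channel left out)."""
    hsv = cv2.cvtColor(bgr_patch, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [h_bins, s_bins],
                        [0, 180, 0, 256])
    # Normalization accounts for the different patch sizes (circle vs bar).
    cv2.normalize(hist, hist, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX)
    return hist

circle = cv2.imread("circle_patch.png")  # hypothetical crops of the two patches
card = cv2.imread("card_patch.png")

h1, h2 = hs_histogram(circle), hs_histogram(card)

# Lower is better for chi-square / Bhattacharyya, higher for intersection.
print(cv2.compareHist(h1, h2, cv2.HISTCMP_CHISQR))
print(cv2.compareHist(h1, h2, cv2.HISTCMP_BHATTACHARYYA))
print(cv2.compareHist(h1, h2, cv2.HISTCMP_INTERSECT))
```

You would compute one histogram per paper rectangle and pick the rectangle with the best score against the circle.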
Using ML for classification
Besides simply computing the distance and taking the closest one (i.e. a 1-nearest-neighbor classifier), you might want to consider training a classifier to do the classification for you. One reason for doing so is that, during training, the classifier will hopefully learn some way to differentiate between the different shades of hue, since it has access to them and is required to tell them apart. Simply computing a distance, i.e. your suggested method, may not have this property. Hopefully this will give better classification.
The features used in the training can still be the color histograms I mentioned above. That is, you compute color histograms as described above for your training samples and pass them to the classifier along with their class (i.e. which shade they are). Then, when you wish to classify a test sample, you likewise compute its color histogram and pass it to the classifier, and it will return the class (shade of color, in your case) that the test sample belongs to.
Potential problems with training a classifier, rather than using a simple distance-comparison-based approach as you have suggested, are partly the added complexity of the program as well as potentially getting bad results when the training data is not good. There is also going to be a lot of parameter tuning involved to get it to work well.
See the OpenCV machine learning tutorials here for more details. Note that in the examples in the link, the classifier only differentiates between 2 classes, whereas you have more than 2 shades of color. This is not a problem, as classifiers in general can work with more than 2 classes.
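For instance, here is a minimal sketch with OpenCV's ml module. It assumes the hs_histogram function from the earlier sketch plus lists training_patches, training_labels and a test_patch, all of which are hypothetical names you would fill in from your own data.

```python
import cv2
import numpy as np

# Flattened H-S histograms as feature vectors, one row per training patch.
samples = np.float32([hs_histogram(p).flatten() for p in training_patches])
labels = np.int32(training_labels).reshape(-1, 1)  # shade index per patch

svm = cv2.ml.SVM_create()
svm.setType(cv2.ml.SVM_C_SVC)
svm.setKernel(cv2.ml.SVM_RBF)
svm.train(samples, cv2.ml.ROW_SAMPLE, labels)

test_vec = np.float32([hs_histogram(test_patch).flatten()])
_, predicted = svm.predict(test_vec)
print("predicted shade:", int(predicted[0, 0]))
```

The SVM parameters are left at their defaults here; this is exactly the kind of tuning mentioned above.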
Hope this helps.
I am extracting all images from given PDF files (containing real estate synopses) as JPEGs using the pdfimages tool. Now I want to automatically distinguish between photos and other pictures, like the broker's logo. How should I do this?
Is there an open tool that can distinguish between photos and clipart/line drawings etc. like google image search does?
Is there an open tool that gives me the number of colors used for a given jpeg?
I know this will bear a certain uncertainty, but that's okay.
I would look at colour distribution. The colours are likely to be densely packed or "too" evenly spread in the case of gradients. Alternatively, you could look at the frequency distribution of the image.
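As a rough illustration of the colour-count idea from the question: the threshold below is an arbitrary assumption, and JPEG compression noise will inflate the count for drawings, so treat it only as one signal among several.

```python
import cv2
import numpy as np

img = cv2.imread("extracted_image.jpg")  # hypothetical image from pdfimages
pixels = img.reshape(-1, 3)
unique_colours = np.unique(pixels, axis=0).shape[0]

print(unique_colours, "distinct colours")
# Drawings/logos tend to use few colours, photos many; 5000 is only a guess.
print("looks like a drawing/logo" if unique_colours < 5000 else "looks like a photo")
```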
You can solve your problem in two steps: (1) extract some kind of information from the image and (2) train a classifier that can distinguish the two types of images:
1 - Feature Extraction
In this step you will have to write a program/function that takes an image as input and returns a numeric vector describing its visual information. As koan points out in his answer, the color distribution contains a lot of useful information, so I would try the following measures (a rough sketch follows the list):
* Histogram of each color channel (Red, Green, Blue), as this is a basic description of the color distribution of the image;
* Mean, standard deviation and other statistical moments of each histogram. This should give you information on how the colors are distributed in the image. For a drawing, such as a logo, the color distribution should be significantly different from that of a photo;
* Fourier descriptors. In a drawing you will probably find a lot of edges, whereas in a photo this is not expected. With Fourier descriptors, you can capture this kind of information.
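Here is a rough sketch of such an extraction function in OpenCV (Python); the bin count and the choice of statistics are assumptions, and the Fourier descriptors are left out for brevity.

```python
import cv2
import numpy as np

def describe(image_bgr):
    """Feature vector: a per-channel histogram plus mean and std per channel."""
    features = []
    for channel in cv2.split(image_bgr):  # B, G, R
        hist = cv2.calcHist([channel], [0], None, [32], [0, 256]).flatten()
        hist /= hist.sum() + 1e-9         # normalize for image size
        features.extend(hist.tolist())
        features.extend([float(channel.mean()), float(channel.std())])
    return np.float32(features)           # length 3 * (32 + 2) = 102
```

Each image then becomes one such vector, which is what you feed to the classifier in step 2.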
2 - Classification
In this step you will train some sort of classifier. Basically, get a set of images and label each one manually as a drawing or a photo. Also, use the extraction function you wrote in step 1 to extract a vector from each image. This will be your training set, which will be used as input to train a classifier. As Neil N commented, a neural network may be overkill (or maybe not?), but there are a lot of classifiers you can use (e.g. k-NN, SVM, decision trees). You don't have to implement the classifier yourself, as you can use machine learning software such as Weka.
Finally, after you have trained your classifier, extract the vector from the image you want to test. Use this vector as input to the classifier to get a prediction of whether the image is a photo or a logo.
A simpler solution is to automatically send the image to Google image search with the 'similar images' setting on, and see whether Google sends back primarily PNG results or JPEG results.
What are the ways to quantify the texture of a portion of an image? I'm trying to detect areas of an image that are similar in texture, with some measure of how similar they are.
So the question is what information about the image (edge, pixel value, gradient etc.) can be taken as containing its texture information.
Please note that this is not based on template matching.
Wikipedia didn't give much detail on actually implementing any of the texture analyses.
Do you want to find two distinct areas in the same image that look the same (same texture), or match a texture in one image to a texture in another image?
The second is harder due to different radiometry.
Here is a basic scheme of how to measure similarity of areas.
1. Write a function which takes an area of the image as input and calculates a scalar value, like average brightness. This scalar is called a feature.
2. Write more such functions to obtain about 8-30 features, which together form a vector that encodes information about the area in the image.
3. Calculate such a vector for both areas that you want to compare.
4. Define a similarity function which takes two vectors and outputs how alike they are.
You need to focus on steps 2 and 4.
Step 2: Use the following features: std() of brightness, some kind of corner detector, an entropy filter, a histogram of edge orientations, a histogram of FFT frequencies (x and y directions). Use color information if available.
Step 4: You can use cosine similarity, min-max or weighted cosine.
After you implement about 4-6 such features and a similarity function, start to run tests. Look at the results and try to understand why or where it doesn't work, then add a specific feature to cover that case.
For example, if you see that a texture with big blobs is regarded as similar to a texture with tiny blobs, then add a morphological filter that calculates the density of objects larger than 20 sq. pixels.
Iterate this process of identifying a problem and designing a specific feature about 5 times and you will start to get very good results.
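A minimal sketch of steps 1-4 in OpenCV (Python); the particular features, thresholds and bin counts are only assumptions to get started.

```python
import cv2
import numpy as np

def texture_features(gray_patch):
    """Steps 1-2: a few illustrative scalar features for one image area."""
    brightness_std = float(gray_patch.std())
    edge_density = float(cv2.Canny(gray_patch, 50, 150).mean())
    # Entropy of the grey-level histogram as a crude texture measure.
    hist = cv2.calcHist([gray_patch], [0], None, [32], [0, 256]).flatten()
    p = hist / (hist.sum() + 1e-9)
    entropy = float(-(p[p > 0] * np.log2(p[p > 0])).sum())
    return np.float32([brightness_std, edge_density, entropy])

def cosine_similarity(a, b):
    """Step 4: 1.0 means the two feature vectors point in the same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

# Step 3: compare two areas (hypothetical grayscale crops from the image).
# sim = cosine_similarity(texture_features(area_a), texture_features(area_b))
```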
I'd suggest using wavelet analysis. Wavelets are localized in both time and frequency and, using multiresolution analysis, give a better signal representation than the FT does.
There is a paper explaining a wavelet approach for texture description. There is also a comparison method.
You might need to slightly modify an algorithm to process images of arbitrary shape.
An interesting approach for this is to use Local Binary Patterns.
Here is a basic example and some explanations: http://hanzratech.in/2015/05/30/local-binary-patterns.html
See that method as one of the many different ways to get features from your pictures. It corresponds to the 2nd step of DanielHsH's method.
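For reference, a small sketch of computing an LBP histogram feature with scikit-image (the radius, neighbour count and file name are assumptions):

```python
import cv2
import numpy as np
from skimage.feature import local_binary_pattern

gray = cv2.imread("texture_patch.png", cv2.IMREAD_GRAYSCALE)  # hypothetical patch

# Uniform LBP with 8 neighbours at radius 1; the normalized histogram of the
# codes is the texture feature vector for this patch.
radius, n_points = 1, 8
lbp = local_binary_pattern(gray, n_points, radius, method="uniform")
hist, _ = np.histogram(lbp.ravel(), bins=np.arange(0, n_points + 3))
hist = hist.astype(np.float32) / (hist.sum() + 1e-9)
print(hist)
```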