I am confused about how to extract features using histogram normalization. Maybe this sounds silly, but I am just confused. I have normalized the image, but how do I represent it as a feature? Does the histogram itself become the feature, or is this the idea behind HOG (Histogram of Oriented Gradients)? I am confused about how to represent a histogram as a feature. Can I get some explanation on this, please?
A feature descriptor may be defined as any property of a pattern that describes it in part or in full. When you extract the histogram of an image, you're extracting a few properties of the image that may be applied or exploited in various ways. For example, if you have a majority of the histogram population biased towards the darker side, you can conclude that the image is either filled with dark patterns or is not illuminated sufficiently. So the histogram itself is an effective feature descriptor.
To make histograms comparable, normalisation is usually necessary. For example, you may want to compare images of different sizes. In this case, the overall population of the histograms will be different for the two images. But once you normalise the histograms, they become comparable, which in turn makes the feature description effective and usable.
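To make that concrete, here is a minimal sketch, assuming OpenCV and NumPy and using hypothetical file names, of how a normalised grayscale histogram becomes a fixed-length feature vector you can compare or feed to a classifier:

    import cv2
    import numpy as np

    def histogram_feature(image_path, bins=32):
        """Turn a grayscale intensity histogram into a fixed-length feature vector."""
        img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        # 1-D histogram with `bins` buckets over the intensity range [0, 256)
        hist = cv2.calcHist([img], [0], None, [bins], [0, 256])
        # Normalise so the bins sum to 1; images of different sizes now
        # produce directly comparable vectors
        hist = hist / hist.sum()
        return hist.flatten()

    # f1 = histogram_feature("image_a.png")   # hypothetical files
    # f2 = histogram_feature("image_b.png")
    # distance = np.linalg.norm(f1 - f2)      # smaller = more similar histograms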
I've seen transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)) in lots of tutorials and in the PyTorch docs. I know the first parameter is the mean and the second one is the std, but I can't understand why the values differ between channels.
That can just be the distribution of the colours in the original dataset used by the code's authors (such as COCO or Pascal VOC). It would only be by chance that all colours were equally represented. However, if you used the same mean in your case, I doubt it would make much of a difference, given how similar the means and stds are.
For example, in my custom dataset taken from a GoPro camera, the means and standard deviations are as such:
mean: [0.2841186 , 0.32399923, 0.27048702],
std: [0.21937862, 0.26193094, 0.23754872]
where the means are hardly equal. This does not mean, however, that the channels are treated differently in the ML model. All this transform does is make sure that each feature is standardised (through z-score standardisation).
Think about it this way: if one colour is represented with a generally higher intensity in your dataset (e.g. if you have lots of vibrant blue pictures of the beach, the sky, lagoons, etc.), then you will have to subtract a larger number from that channel to ensure that your data is standardised.
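If you want to compute the mean and std for your own dataset, a rough sketch (assuming PIL and NumPy are available; the folder name is hypothetical, and the statistics are pooled over all pixels and scaled to [0, 1] so they match what ToTensor produces) could look like this:

    import numpy as np
    from pathlib import Path
    from PIL import Image

    def channel_mean_std(image_dir):
        """Estimate per-channel mean/std of an image folder, scaled to [0, 1]."""
        sums = np.zeros(3)
        sq_sums = np.zeros(3)
        n_pixels = 0
        for path in Path(image_dir).glob("*.jpg"):
            img = np.asarray(Image.open(path).convert("RGB"), dtype=np.float64) / 255.0
            pixels = img.reshape(-1, 3)
            sums += pixels.sum(axis=0)
            sq_sums += (pixels ** 2).sum(axis=0)
            n_pixels += pixels.shape[0]
        mean = sums / n_pixels
        std = np.sqrt(sq_sums / n_pixels - mean ** 2)
        return mean, std

    # mean, std = channel_mean_std("my_gopro_frames/")            # hypothetical folder
    # transform = transforms.Normalize(mean.tolist(), std.tolist())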
How can I use one of these feature types to detect an object in an illumination/brightness-invariant way?
I am interested in using features that are resistant to:
different lighting
half of the object in the shadow
glare/reflections
Does it make sense to use the hue (the first component of the HSV colour space), or the average of the hue and the brightness?
And which feature is best for brightness-invariant detection: SIFT/SURF, ORB, BRISK/FREAK, or KAZE/AKAZE?
Sorry for the late response but this answer might be beneficial to someone else.
This is quite a problematic area that I am also facing. Unfortunately, I don't know of any feature detector-descriptor combination that is illumination invariant, so here are some suggestions you might want to consider:
You can pre-process the images with a Wallis filter so that both images end up with a balanced brightness level throughout (see the sketch after these suggestions).
You can also normalise the descriptor values of the detected features, since descriptors are built from image intensity values: if the images have different illumination conditions, corresponding features will end up with different descriptors as well.
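As a rough sketch of both suggestions (the Wallis filter below is a simplified local mean/std normalisation rather than the full weighted formulation, the file name is hypothetical, and SIFT_create assumes opencv-python >= 4.4):

    import cv2
    import numpy as np

    def wallis_filter(gray, win=31, target_mean=127.0, target_std=50.0, eps=1e-6):
        """Push each local window towards a common mean and std to even out illumination."""
        img = gray.astype(np.float64)
        local_mean = cv2.blur(img, (win, win))
        local_sq_mean = cv2.blur(img * img, (win, win))
        local_std = np.sqrt(np.maximum(local_sq_mean - local_mean ** 2, 0.0))
        out = (img - local_mean) * (target_std / (local_std + eps)) + target_mean
        return np.clip(out, 0, 255).astype(np.uint8)

    def l2_normalise(descriptors, eps=1e-7):
        """L2-normalise each row of a float descriptor matrix (e.g. SIFT output)."""
        desc = descriptors.astype(np.float64)
        return desc / (np.linalg.norm(desc, axis=1, keepdims=True) + eps)

    # gray = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)   # hypothetical file
    # balanced = wallis_filter(gray)
    # sift = cv2.SIFT_create()
    # keypoints, desc = sift.detectAndCompute(balanced, None)
    # desc = l2_normalise(desc)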
I am working on a project to segment aerial images and classify each segment. The images are very large and have huge homogeneous areas, so I decided to use a split-and-merge algorithm for the segmentation.
(On the left the original image, on the right the segmented one, where each segment is represented by its mean RGB value. Thanks to this answer.)
For the classification I want to use a SVM Classifier (I used it a lot in two projects before) with a feature vector.
For the beginning I just want to use five classes: Water, Vegetation, Built up area, Dune and Anomaly
Now I am thinking about what I can put in this feature vector:
The mean RGB Value of the Segment
A texture feature (but can I represent the texture of the segment with just one value?)
The position in the source image (maybe a value that represents left, right, or middle?)
The size of the segment (Water segments should be much larger than built areas)
The mean RGB values of the segment's 4-neighborhood
So, has anyone done something like this and can give me some advice on what useful features I can put in the feature vector? And can someone advise me on how to represent the texture of a segment correctly?
Thank you for your help.
Instead of the split-and-merge algorithm, you could also use superpixels. There are several fast and easy-to-use superpixel algorithms available (some are even implemented in recent OpenCV versions). To name just a few:
Compact Watershed (see here: https://www.tu-chemnitz.de/etit/proaut/forschung/rsrc/cws_pSLIC_ICPR.pdf)
preSLIC and SLIC (see here: https://www.tu-chemnitz.de/etit/proaut/forschung/rsrc/cws_pSLIC_ICPR.pdf and here: http://www.kev-smith.com/papers/SLIC_Superpixels.pdf)
SEEDS (see here: https://arxiv.org/abs/1309.3848)
ERGC (see here: https://sites.google.com/site/pierrebuyssens/code/ergc)
Given the superpixel segmentation, there is a vast set of features you can compute in order to classify them:
In Automatic Photo Pop-Up (Table 1), Hoiem et al. consider, among others, the following features: mean RGB color, mean HSV color, color histograms, saturation histograms, textons, differently oriented Gaussian derivative filters, mean x and y location, area, ...
In Recovering Occlusion Boundaries from a Single Image (Table 1), Hoiem et al. consider some additional features beyond the list above.
In SuperParsing: Scalable Nonparametric Image Parsing with Superpixels, Tighe et al. additionally consider SIFT histograms, the segment mask reduced to an 8 x 8 image, bounding box shape, and color thumbnails.
In Class Segmentation and Object Localization with Superpixel Neighborhoods, Fulkerson et al. also consider features from neighboring superpixels.
Based on the superpixels, you can still apply a simple merging-scheme in order to reduce the number of superpixels. Simple merging by color histograms might already be useful for your tasks. Otherwise you can additionally use edge information in between superpixels for merging.
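As a minimal sketch of the superpixel route (assuming scikit-image's SLIC implementation; the file name and the particular per-segment features are just for illustration):

    import numpy as np
    from skimage.io import imread
    from skimage.segmentation import slic

    def superpixel_features(image_path, n_segments=300):
        """SLIC superpixels plus a simple feature vector (mean RGB, area, centroid) per segment."""
        img = imread(image_path)                      # H x W x 3, uint8
        labels = slic(img, n_segments=n_segments, compactness=10)
        h, w = labels.shape
        feats = []
        for lbl in np.unique(labels):
            mask = labels == lbl
            ys, xs = np.nonzero(mask)
            feats.append(np.concatenate([
                img[mask].mean(axis=0) / 255.0,       # mean RGB colour
                [mask.sum() / (h * w)],               # relative area
                [ys.mean() / h, xs.mean() / w],       # normalised centroid
            ]))
        return labels, np.vstack(feats)

    # labels, X = superpixel_features("aerial_tile.png")   # hypothetical file
    # X can then be fed to an SVM together with per-segment class labels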
You don't need to restrict yourself to just one feature vector. You could try multiple feature vectors (from the list you already have) and feed them to classifiers based on multiple kernel learning (MKL). MKL has been shown to improve performance over single-feature approaches, and one of my favourite MKL techniques is VBpMKL.
If you have time, I would suggest you try one or more of the following features, which can capture the properties of interest (a small sketch follows the list):
Haralick texture features
Histogram of oriented gradients (HOG) features
Gabor filters
SIFT
patch-wise RGB means
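A rough sketch of a few of these features (HOG plus some Haralick-style GLCM statistics and patch-wise RGB means, assuming scikit-image; Gabor filters and SIFT are left out, and patches are assumed to have been resized to a common size so the HOG vector length stays fixed):

    import numpy as np
    from skimage.color import rgb2gray
    from skimage.feature import hog, graycomatrix, graycoprops  # "grey..." in older scikit-image
    from skimage.util import img_as_ubyte

    def texture_and_colour_features(rgb_patch):
        """Concatenate HOG, a few GLCM texture statistics, and patch-wise RGB means."""
        gray = img_as_ubyte(rgb2gray(rgb_patch))
        # Gradient-orientation histogram (HOG)
        hog_vec = hog(gray, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
        # Gray-level co-occurrence matrix and a handful of Haralick-type properties
        glcm = graycomatrix(gray, distances=[1], angles=[0, np.pi / 2], levels=256, normed=True)
        texture = [graycoprops(glcm, prop).mean()
                   for prop in ("contrast", "homogeneity", "energy", "correlation")]
        # Patch-wise RGB means
        rgb_means = rgb_patch.reshape(-1, 3).mean(axis=0) / 255.0
        return np.concatenate([hog_vec, texture, rgb_means])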
I am trying to implement Canny edge detection (found here: Canny edge) to differentiate objects based on their shapes. I would like to know: what are the features? I need to find a score/metric so that I can define a probability from information like the mean of the shape. The purpose is to differentiate between objects of different shapes. So, let's assume that the mean shape (x) of Object1 and Object2 is x1 and x2, and the standard deviation (s) is s1 and s2, respectively. From what do I calculate this information, and how do I find it?
The Canny algorithm is an edge detector. It searches for high frequencies in the image by computing the magnitude of the derivatives in the x and y directions. In the end you have contours of objects. What you are trying to do is classify objects, and using Canny alone does not sound like the right way to do it. I am not saying you cannot build features out of edges, but it might perform poorly.
In order to achieve what you want, you first need to identify which features are important for you. You mentioned shape, but is color a good feature for the class of objects you are trying to find? Your pictures show very colorful objects. Are you only trying to distinguish one object from another (assuming the images display only the object of interest), or do you want to locate them in the scene? Does the image contain only one object or multiple ones?
I will give you some direction regarding feature modeling.
If color is strong information for your objects, you could model your features using histogram information: compute n bins for all objects and store the distribution of the bins as a feature vector. You can also use HOG.
Another possible (naive) solution is to collect the colors of patches (e.g. 7x7) belonging to each object and then compute the histogram over patches instead of single pixels.
If you are not satisfied with color information and you would like to differentiate objects by comparing information in their neighborhood, you can use local binary patterns, which might be enough for the type of information you have.
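For instance, a minimal uniform-LBP histogram per object patch might look like this (assuming scikit-image; the parameters are purely illustrative):

    import numpy as np
    from skimage.feature import local_binary_pattern

    def lbp_histogram(gray_patch, n_points=8, radius=1):
        """Uniform LBP histogram of a grayscale patch, normalised to sum to 1."""
        lbp = local_binary_pattern(gray_patch, n_points, radius, method="uniform")
        n_bins = n_points + 2          # uniform patterns plus one "non-uniform" bin
        hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins))
        return hist.astype(np.float64) / (hist.sum() + 1e-7)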
Once you decide on the features that are important and have modeled them, you can move on to classification (which will determine which object you are seeing given a certain feature).
A probabilistic framework tries to estimate the posterior probability P(X|C), i.e. the probability of the object being X given that we observed C (C could be your feature), and this is very powerful. You might consider reading about maximum likelihood estimation and maximum a posteriori estimation. Also, a Naive Bayes classifier is a simple off-the-shelf algorithm available in OpenCV that you could use.
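A minimal sketch of that off-the-shelf classifier via OpenCV's ml module; the feature matrix and labels below are random placeholders standing in for whatever descriptors you settle on:

    import cv2
    import numpy as np

    # Placeholder data: one row per training sample, columns are the chosen features
    train_feats = np.random.rand(100, 16).astype(np.float32)
    train_labels = np.random.randint(0, 3, (100, 1)).astype(np.int32)   # 3 object classes

    bayes = cv2.ml.NormalBayesClassifier_create()
    bayes.train(train_feats, cv2.ml.ROW_SAMPLE, train_labels)

    test_feat = np.random.rand(1, 16).astype(np.float32)
    _, predicted = bayes.predict(test_feat)
    print("predicted class:", int(predicted[0, 0]))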
You could use many other algorithms, such as SVM, Boost, Decision Trees, Neural Networks and so on. Bag of visual words is also a nice alternative.
If you are interested in how to separate the object of interest from the background, you are talking about image segmentation; you can look at k-means or the more powerful graph cuts techniques. Of course, you can always segment first and then classify the segmented blobs.
Samuel
I have lots of images of paper cards in different shades of colors, like all blues, all reds, etc. In the images, they are held up next to different objects of that color.
I want to write a program to compare the color to the shades on the card and choose the closest shade to the object.
However, I realize that in future images my camera is going to be subject to lots of different lighting, so I think I should convert to HSV space.
I'm also unsure what type of distance measure I should use. Given some sort of blobs from the cards, I could average over the HSV values and simply see which blob's average is closest.
But I welcome any and all suggestions, I want to learn more about what I can do with OpenCV.
EDIT: A sample
Here I want to compare the filled-in red of the 6th dot to see whether it is actually the shade of the 3rd paper rectangle.
I think one possibility is to do the following:
Color histograms from Hue and Saturation channels
compute the color histogram of the filled circle.
compute color histogram of the bar of paper.
compute a distance using histogram distance measures.
Possibilities here include:
Chi-square,
Earth mover's distance,
Bhattacharyya distance,
Histogram intersection, etc.
Check this OpenCV link for details on computing histograms.
Check this OpenCV link for details on histogram comparisons.
Note that when computing the color histograms, convert your images to the HSV colorspace, as you yourself suggested. Then, there are two things to note here (a short sketch follows them).
1. [EDITED to make this a suggestion rather than a must-do, because I believe the V channel might be necessary to differentiate the shades. Anyhow, try both and go with the one giving the better result. Apologies if this sent you off track.] One possibility is to use only the Hue and Saturation channels, i.e. you build a 2D histogram rather than a 3D one, using values from the hue and saturation channels. The reason for doing so is that the variation in lighting is felt most strongly in the V channel. This, together with the use of histograms, should hopefully make your comparisons more robust to lighting changes. There is some discussion on ignoring the V channel when building color histograms in this post here. You might find the references therein useful.
2. Normalize the histograms using the OpenCV functions. This is to account for the different sizes of the patches of material (your small circle and the huge color bar have different numbers of pixels).
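Putting points 1 and 2 together, a minimal sketch (OpenCV, with hypothetical image crops) of building normalised Hue-Saturation histograms and comparing them with the measures listed above could look like this:

    import cv2

    def hs_histogram(bgr_patch, h_bins=30, s_bins=32):
        """2-D Hue-Saturation histogram, normalised so patches of different sizes compare directly."""
        hsv = cv2.cvtColor(bgr_patch, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0, 1], None, [h_bins, s_bins], [0, 180, 0, 256])
        cv2.normalize(hist, hist, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX)
        return hist

    # circle = cv2.imread("filled_circle.png")        # hypothetical crops
    # swatch = cv2.imread("paper_swatch_3.png")
    # h1, h2 = hs_histogram(circle), hs_histogram(swatch)
    # print(cv2.compareHist(h1, h2, cv2.HISTCMP_CHISQR))
    # print(cv2.compareHist(h1, h2, cv2.HISTCMP_BHATTACHARYYA))
    # print(cv2.compareHist(h1, h2, cv2.HISTCMP_INTERSECT))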
You might also wish to consider performing some form of preprocessing to "stretch" the colors in the image, e.g. using histogram equalization or an "S-curve" mapping, so that the different shades of color become better separated. Then compute the color histograms on this processed image. Keep the information for the mapping and apply it to new test samples before computing their color histograms.
Using ML for classification
Besides simply computing the distance and taking the closest one (i.e. a 1-nearest-neighbour classifier), you might want to consider training a classifier to do the classification for you. One reason for doing so is that the classifier will hopefully learn, during training, some way to differentiate between the different shades of hue, since it has access to them in the training phase and is required to tell them apart. Notice that simply computing a distance, i.e. your suggested method, may not have this property. Hopefully this will give better classification.
The features used in training can still be the color histograms that I mentioned above. That is, you compute color histograms as described above for your training samples and pass them to the classifier along with their class (i.e. which shade they are). Then, when you wish to classify a test sample, you likewise compute a color histogram and pass it to the classifier, and it will return the class (shade of color, in your case) that the test sample belongs to.
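As a rough sketch of that training/prediction loop with OpenCV's built-in SVM (the feature matrix and labels here are random placeholders for your actual colour histograms and shade classes):

    import cv2
    import numpy as np

    # Placeholder training data: each row is a flattened H-S histogram, each label a shade id
    X = np.random.rand(60, 30 * 32).astype(np.float32)
    y = np.random.randint(0, 5, (60, 1)).astype(np.int32)   # e.g. 5 shades

    svm = cv2.ml.SVM_create()
    svm.setType(cv2.ml.SVM_C_SVC)
    svm.setKernel(cv2.ml.SVM_RBF)
    svm.setC(1.0)
    svm.setGamma(0.5)
    svm.train(X, cv2.ml.ROW_SAMPLE, y)

    test = np.random.rand(1, 30 * 32).astype(np.float32)
    _, pred = svm.predict(test)
    print("predicted shade:", int(pred[0, 0]))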
Potential problems with training a classifier, rather than using a simple distance-based comparison as you suggested, include the added complexity of the program as well as potentially bad results when the training data is not good. There is also going to be a lot of parameter tuning involved to get it to work well.
See the OpenCV machine learning tutorials here for more details. Note that in the examples in the link, the classifier only differentiates between 2 classes, whereas you have more than 2 shades of color. This is not a problem, as the classifiers in general can work with more than 2 classes.
Hope this helps.