I was wondering why, in OpenCV examples of meanshift tracking, only the Hue channel is used.
In https://docs.opencv.org/4.x/d7/d00/tutorial_meanshift.html, the following line of code shows what I mean:
roi_hist = cv.calcHist([hsv_roi],[0],mask,[180],[0,180])
I understand the main idea of converting from the RGB color space to HSV, but I do not get why selecting only Hue is enough. I know that roi_hist is later used to create a back projection, but I also know that it is possible to create a 2-D roi_hist by also selecting Saturation.
What does this depend on? Should I expect that adding Saturation will improve my tracking results? I want to perform face tracking, so I am looking for skin color.
Thanks in advance for help.
The OpenCV tutorial you linked references the paper that introduces CAMSHIFT. The CAMSHIFT algorithm was designed to track human faces. On page three, the paper states:
Except for albinos, humans are all the same color (hue). Dark-skinned people simply have greater flesh color saturation than light-skinned people, and this is separated out in the HSV color system and ignored in our flesh-tracking color model.
Using only the H channel of HSV therefore allows a single-channel tracker that works for most human faces.
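If your target is not as well separated by hue alone, a 2-D hue/saturation histogram is a small change to the tutorial code. Here is a minimal sketch, assuming roi and frame come from the tutorial's setup:

import cv2 as cv

# roi and frame are assumed to come from the tutorial's tracking code
hsv_roi = cv.cvtColor(roi, cv.COLOR_BGR2HSV)
mask = cv.inRange(hsv_roi, (0., 60., 32.), (180., 255., 255.))
# Histogram over hue AND saturation instead of hue alone
roi_hist = cv.calcHist([hsv_roi], [0, 1], mask, [180, 256], [0, 180, 0, 256])
cv.normalize(roi_hist, roi_hist, 0, 255, cv.NORM_MINMAX)

# Inside the tracking loop, back-project both channels before cv.meanShift
hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
dst = cv.calcBackProject([hsv], [0, 1], roi_hist, [0, 180, 0, 256], 1)

Whether the extra dimension helps depends on the target: for skin, hue alone is usually distinctive enough, while low-saturation or similarly hued targets can benefit from the 2-D version.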
I am using a background subtraction method to detect moving objects. The objects in my experiment are made of reflective material, which makes them difficult to detect. How could I resolve this?
Thank you!
EDIT: I'm using MOG2 background subtraction (in OpenCV). The OpenCV version is 3.1.0.
EDIT 1: Updated the results after applying the method in HSV colour space
Step 1: Convert to HSV colour space
Step 2: Apply MoG2
I'm assuming your camera is static, you know the background model, and you are using something like a MOG detector. The simplest approach is to use a color space that separates luminance from hue and saturation; one such example is the HSV color space. OpenCV provides the cvtColor function to convert, e.g., from BGR (the default) to HSV. You can then use just the hue and saturation channels to avoid the influence of value (lighting) variations. This however won't work for extremely shiny objects, like plastic or shiny metal lit by sunlight, which appear white to the camera.
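A minimal sketch of this idea, assuming a static camera and a placeholder input file; the V channel is flattened to a constant so the MOG2 model keys mostly on hue and saturation:

import cv2 as cv

cap = cv.VideoCapture("input.avi")  # placeholder file name
mog2 = cv.createBackgroundSubtractorMOG2(detectShadows=False)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
    hsv[:, :, 2] = 128          # neutralize brightness variations
    fg = mog2.apply(hsv)
    cv.imshow("foreground", fg)
    if cv.waitKey(30) == 27:    # Esc quits
        break

cap.release()
cv.destroyAllWindows()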
Another way to deal with this problem is to use motion tracking, e.g. optical flow. If you are really interested and want to get into the details, I can refer you to some specific papers.
Let me quickly explain what I have: I have written a custom detector that finds the regions of billiard balls in an image. I did this using the HSV color space, and for most balls I could get away with thresholding only the Hue channel. However, for the orange (#5) and brown (#7) balls one must take saturation into account, which adds another dimension to the problem.
From my research it seems like my best route would be to do some manner of mean-shift tracking but everything I've come across has described mean-shift in which only one channel is used (the hue channel).
Can anyone please explain, or offer a link explaining, how I can adapt mean-shift to work using hue and saturation?
Or can you tell me if you think a different tracking algorithm may be better suited to this problem?
In theory, mean shift works well regardless of the dimensionality (in very high dimensions sparseness is a bit of an issue, but there is work that addresses that problem).
If you are trying to use an off-the-shelf mean shift tracker that only takes a single-channel input, you can create your own problem-specific color channel. You need a single channel that maximizes the difference between the differently colored billiard balls.
The easiest way of doing that is to take the mean colors of all 15 balls, put them in a 15x3 matrix, and decompose it with SVD (subtracting the mean first), which gives you the axis of maximal variance. This is the best linear transformation from RGB to a new one-dimensional color space that maximizes the difference between the billiard ball colors. (If that isn't good enough you can do better with a local mapping, but it might not be necessary.)
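A minimal sketch of that projection, assuming you have already measured the mean BGR color of each ball (the file name is a placeholder):

import numpy as np

ball_colors = np.loadtxt("ball_colors.txt")  # hypothetical 15x3 matrix of mean BGR colors

mean = ball_colors.mean(axis=0)
# SVD of the centered matrix: the first right singular vector is the
# direction of maximal variance among the ball colors
_, _, vt = np.linalg.svd(ball_colors - mean, full_matrices=False)
axis = vt[0]

def to_custom_channel(img):
    # Project every pixel onto that axis, then rescale into [0, 255]
    proj = (img.reshape(-1, 3).astype(np.float64) - mean) @ axis
    proj -= proj.min()
    if proj.max() > 0:
        proj *= 255.0 / proj.max()
    return proj.reshape(img.shape[:2]).astype(np.uint8)

The resulting single-channel image can then be fed to the hue-only tracker in place of the hue channel.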
I have lots of images of paper cards with different shades of colors, like all blues or all reds. In the images, the cards are held up against different objects of that color.
I want to write a program to compare the color to the shades on the card and choose the closest shade to the object.
However, I realize that future images from my camera are going to be subject to lots of different lighting conditions. I think I should convert to HSV space.
I'm also unsure of what type of distance measure I should use. Given blobs from the cards, I could average over the HSV values and simply see which blob's average is closest.
But I welcome any and all suggestions, I want to learn more about what I can do with OpenCV.
EDIT: A sample
Here I want to compare the filled-in red of the 6th dot to see if it is actually the shade of the 3rd paper rectangle.
I think one possibility is to do the following:
Color histograms from Hue and Saturation channels
Compute the color histogram of the filled circle.
Compute the color histogram of the bar of paper.
Compute a distance between the two using histogram distance measures (a short sketch follows after the links below).
Possibilities here include:
Chi-square,
Earth Mover's Distance,
Bhattacharyya distance,
histogram intersection, etc.
Check the OpenCV documentation for details on computing histograms.
Check the OpenCV documentation for details on the histogram comparison functions.
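A minimal sketch of those steps, assuming two patches loaded from placeholder files; other cv.HISTCMP_* flags can be swapped in for the Bhattacharyya one:

import cv2 as cv

def hs_histogram(bgr_img, mask=None):
    # 2-D hue/saturation histogram, normalized so patch size doesn't matter
    hsv = cv.cvtColor(bgr_img, cv.COLOR_BGR2HSV)
    hist = cv.calcHist([hsv], [0, 1], mask, [30, 32], [0, 180, 0, 256])
    cv.normalize(hist, hist, 0, 1, cv.NORM_MINMAX)
    return hist

circle = cv.imread("circle_patch.png")  # placeholder file names
card = cv.imread("card_patch.png")

# Lower Bhattacharyya distance means more similar color distributions
dist = cv.compareHist(hs_histogram(circle), hs_histogram(card),
                      cv.HISTCMP_BHATTACHARYYA)
print(dist)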
Note that when computing the color histograms, you should convert your images to the HSV colorspace, as you yourself suggested. Then there are two things to note here.
[EDITED to make this a suggestion rather than a must, because I believe the V channel might be necessary to differentiate the shades. Anyhow, try both and go with the one giving the better result. Apologies if this sent you off track.] One possibility is to use only the Hue and Saturation channels, i.e. build a 2-D histogram from the hue and saturation values rather than a 3-D one. The reason for doing so is that the variation in lighting is felt most in the V channel. This, together with the use of histograms, should hopefully make your comparisons more robust to lighting changes. There is some discussion on ignoring the V channel when building color histograms in a related post; you might find the references therein useful.
Normalize the histograms using the OpenCV functions. This is to account for the different sizes of the patches of material (your small circle and the huge color bar have different numbers of pixels).
You might also wish to consider performing some form of preprocessing to "stretch" the colors in the image, e.g. using histogram equalization or an "S-curve" mapping, so that the different shades of color become better separated. Then compute the color histograms on this processed image. Keep the parameters of the mapping and apply the same mapping to new test samples before computing their color histograms.
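As one illustrative possibility (an assumption, not the only sensible preprocessing), equalizing the saturation channel spreads nearby shades apart before the histograms are computed:

import cv2 as cv

def stretch_colors(bgr_img):
    # Equalize the S channel so similar shades spread out more;
    # apply this same mapping to both training and test images
    hsv = cv.cvtColor(bgr_img, cv.COLOR_BGR2HSV)
    hsv[:, :, 1] = cv.equalizeHist(hsv[:, :, 1])
    return cv.cvtColor(hsv, cv.COLOR_HSV2BGR)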
Using ML for classification
Besides simply computing the distance and taking the closest one (i.e. a 1-nearest-neighbor classifier), you might want to consider training a classifier to do the classification for you. One reason for doing so is that during training the classifier has access to the different shades and is required to differentiate them, so it will hopefully learn a way to tell them apart. Simply computing a distance, i.e. your suggested method, may not have this property. Hopefully this will give better classification.
The features used in training can still be the color histograms mentioned above. That is, you compute color histograms as described above for your training samples and pass them to the classifier along with their class (i.e. which shade they are). Then, when you wish to classify a test sample, you likewise compute a color histogram, pass it to the classifier, and it returns the class (in your case, the shade of color) that the test sample belongs to.
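A minimal sketch of that flow with OpenCV's ML module, reusing the hs_histogram() helper sketched earlier and assuming hypothetical lists train_patches, train_labels, and a test_patch:

import cv2 as cv
import numpy as np

# Flatten each 2-D H-S histogram into a feature vector
features = np.array([hs_histogram(p).ravel() for p in train_patches],
                    dtype=np.float32)
labels = np.array(train_labels, dtype=np.int32)  # one integer per shade

svm = cv.ml.SVM_create()
svm.setType(cv.ml.SVM_C_SVC)
svm.setKernel(cv.ml.SVM_RBF)
svm.setC(1.0)      # these parameters will likely need tuning
svm.setGamma(0.5)
svm.train(features, cv.ml.ROW_SAMPLE, labels)

test_feat = hs_histogram(test_patch).ravel().astype(np.float32)[None, :]
_, pred = svm.predict(test_feat)
print(int(pred[0, 0]))  # predicted shade label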
Potential problems with training a classifier, rather than using the simple distance-based comparison you suggested, are the added complexity of the program and the possibility of bad results when the training data is not good. There will also be a lot of parameter tuning involved to get it to work well.
See the OpenCV machine learning tutorials for more details. Note that in those examples the classifiers only differentiate between 2 classes, whereas you have more than 2 shades of color. This is not a problem, as classifiers in general can work with more than 2 classes.
Hope this helps.
I can't find the best way to detect red under different illumination or backgrounds.
I found that there is the YCbCr color space, which is good for red or blue detection (I actually need to detect blue as well). The problem is that I can't figure out which threshold to use under different lighting. For example, in sunny weather this threshold equals 210 (out of 255), while in cloudy weather it equals 130.
I use the OpenCV library to implement this.
Thanks for any help or advice.
Yes, HSV is usually used for this purpose. In HSV you can tell that, whatever the brightness, red is still red. I also recommend looking in two places: one is the simple tutorial at http://aishack.in/tutorials/tracking-colored-objects-in-opencv/, and the other is the book Learning OpenCV, using the histogram examples from there. They do exactly what you need. Using HSV and histograms makes your solution solid.
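To illustrate the HSV route: red wraps around hue 0 in OpenCV's [0, 180] hue range, so it needs two ranges combined, while blue sits in a single range around hue 120. The exact bounds below are assumptions to tune for your camera:

import cv2 as cv

img = cv.imread("scene.png")  # placeholder file name
hsv = cv.cvtColor(img, cv.COLOR_BGR2HSV)

# Red straddles the hue wrap-around, so combine two ranges
lower_red = cv.inRange(hsv, (0, 80, 50), (10, 255, 255))
upper_red = cv.inRange(hsv, (170, 80, 50), (180, 255, 255))
red_mask = cv.bitwise_or(lower_red, upper_red)

# Blue needs only one range
blue_mask = cv.inRange(hsv, (100, 80, 50), (130, 255, 255))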
The HSV color space should be more robust to changes in illumination.
I'm working on a way to detect the floor in an image. I'm trying to accomplish this by reducing the image to areas of color and then assuming that the largest area is the floor. (We get to make some pretty extensive assumptions about the environment the robot will operate in.)
What I'm looking for is some recommendations on algorithms that would be suited to this problem. Any help would be greatly appreciated.
Edit: specifically, I am looking for an image segmentation algorithm that can reliably extract one area. Everything I've tried (mainly PyrSegmentation) seems to work by reducing the image to N colors, which causes false positives when the camera is looking at an empty area.
Since floor detection is the main aim, I'd say instead of segmenting by color, you could try separation by texture.
The Eigen transform paper describes a single-value descriptor of texture "roughness" using the average of eigenvalues over a grayscale window in the image/video frame. On pg. 78 of the paper they apply mean-shift segmentation to the eigen-transform output image, effectively separating it into different textures.
Since your images come from a video feed, there can be a lot of variation in lighting, so color segmentation might pose a few problems (unless you work in HSV or other color spaces, as mentioned above). The calculation of the eigenvalues is simple and fast in OpenCV with the cvSVD() function.
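A rough, unoptimized sketch of such a descriptor, assuming the roughness of a window is the mean of its smaller singular values; the window size and the number of leading values to skip are assumptions to check against the paper:

import numpy as np

def eigen_transform(gray, w=15, skip=3):
    # gray: 2-D grayscale array; returns a per-pixel "roughness" map
    h, width = gray.shape
    out = np.zeros((h, width), dtype=np.float32)
    half = w // 2
    for y in range(half, h - half):
        for x in range(half, width - half):
            win = gray[y - half:y + half + 1, x - half:x + half + 1].astype(np.float32)
            s = np.linalg.svd(win, compute_uv=False)
            out[y, x] = s[skip:].mean()  # average the smaller singular values
    return out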
If you can make the assumption of colour constancy, your main issue is going to be changes in lighting that throw off your colour detection.
To that end, convert your input image to HSV, HSL, CIE-Lab, YUV, or some other luminance-separated colourspace and segment your image based on just the colour part (leaving out the luminance component: V, L, L and Y respectively in the examples above). This will help you overcome the obstacle of shadows and variations in lighting.
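A minimal sketch of that segmentation, clustering pixels on hue and saturation only with k-means; the cluster count and file name are placeholders, and the largest cluster is taken as the floor per the question's assumption:

import cv2 as cv
import numpy as np

img = cv.imread("floor.png")  # placeholder file name
hsv = cv.cvtColor(img, cv.COLOR_BGR2HSV)
hs = hsv[:, :, :2].reshape(-1, 2).astype(np.float32)  # drop the V channel

k = 4  # assumed cluster count
criteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 20, 1.0)
_, labels, _ = cv.kmeans(hs, k, None, criteria, 5, cv.KMEANS_RANDOM_CENTERS)
labels = labels.reshape(img.shape[:2])

# Take the largest cluster as the floor, per the stated assumption
floor_label = np.bincount(labels.ravel()).argmax()
floor_mask = (labels == floor_label).astype(np.uint8) * 255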