OpenCV motion detection with tracking - opencv

I need robust motion detection and tracking in a webcam's video frames. The background is always the same. The aim is to identify the position of the object, if possible without its shadow, although removing shadows is not urgent. I've tried OpenCV's background subtraction and thresholding, but that depends on a single image as the background. What if the background changes slightly in brightness (or the camera auto-focuses)? I need the algorithm to be robust to small changes in brightness and to some shadows.

Robust tracking methods are part of a broad research area that is being actively developed all around the world...
Here are some possible keys to your problem, which is very interesting but wide open.
First, many methods assume brightness constancy (which is why what you ask is difficult to achieve). For instance:
Lucas-Kanade
Horn-Schunck
Block-matching
are widely used for tracking but assume brightness constancy.
Then, other interesting options are meanshift or camshift tracking, but you need a projection to follow... However, you can use a back-projection computed according to a certain threshold to fit your robustness needs...
I'll post later about that,
Julien,

When you do the thresholding in OpenCV, are you using the RGB (red, green, blue) or HSV (hue, saturation, value) colour format? From personal experience, I find HSV encoding far superior for tracking coloured objects in video footage when used with OpenCV for thresholding and cvBlobsLib for identifying the blob location.
HSV is easier because a single number ("hue") is enough to detect the colour, even though there may well be several shades of that colour, from light to dark. (The amount of colour and the brightness of the colour are handled by the "saturation" and "value" parameters respectively.)
I threshold the HSV reference image ('imgHSV') to obtain a binary (black and white) image using a call to the cvInRangeS() OpenCV API:
cvInRangeS( imgHSV,
cvScalar( 104, 178, 70 ),
cvScalar( 130, 240, 124 ),
imgThresh );
In the above example, the two cvScalar parameters are the lower and upper bounds of the HSV values that represent blueish hues. In my own experiments I was able to obtain suitable min/max values by grabbing screenshots of the object(s) I was interested in tracking and observing the hue/saturation/value values that occur.
More detailed descriptions with a code sample can be found on this blog posting.
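To make the single-hue idea concrete, here is a minimal Python sketch that uses only the standard library (colorsys) rather than OpenCV itself. It assumes OpenCV's 8-bit HSV convention (hue stored as degrees/2, so 0-179, with saturation and value in 0-255). The helper names to_opencv_hsv and in_range are made up for this illustration and only mimic, for a single pixel, what cvInRangeS does for a whole image.

```python
import colorsys

def to_opencv_hsv(r, g, b):
    """Convert one 8-bit RGB pixel to HSV on OpenCV's 8-bit scale (H 0-179, S and V 0-255)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return (int(round(h * 180)) % 180, int(round(s * 255)), int(round(v * 255)))

def in_range(hsv, lower, upper):
    """True if every channel lies inside its inclusive [lower, upper] bound."""
    return all(lo <= c <= hi for c, lo, hi in zip(hsv, lower, upper))

# Bounds taken from the cvInRangeS call above.
LOWER = (104, 178, 70)
UPPER = (130, 240, 124)

dark_blue = to_opencv_hsv(20, 30, 100)     # a dark, saturated blue
light_blue = to_opencv_hsv(128, 128, 255)  # a pale blue with a similar hue
red = to_opencv_hsv(200, 30, 30)

print(dark_blue, in_range(dark_blue, LOWER, UPPER))
print(light_blue, in_range(light_blue, LOWER, UPPER))
print(red, in_range(red, LOWER, UPPER))
```

Both blues land inside the hue bounds even though their brightness differs a lot. Only the saturation and value bounds decide whether the pale blue is accepted, which is why those two ranges are the ones worth tuning from screenshots.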

Adrian has a cool tutorial: http://www.pyimagesearch.com/2015/05/25/basic-motion-detection-and-tracking-with-python-and-opencv/
I followed it and got good results in my own test:
https://youtu.be/HJBOOZVefXA
I use a static image as the background as well:
frameDelta = cv2.absdiff(firstFrame, gray)
thresh = cv2.threshold(frameDelta, 25, 255, cv2.THRESH_BINARY)[1]
thresh = cv2.dilate(thresh, None, iterations=2)
(cnts, _) = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
cv2.CHAIN_APPROX_SIMPLE)
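These 4 lines of code find motion well. (Note that cv2.findContours returns three values in OpenCV 3.x, so the unpacking on the last line has to be adjusted there.)
Good luck.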
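Since the original question was about a background that drifts in brightness, one possible variation on the static firstFrame is a background that is slowly averaged over time. This is a different technique from the snippet above, not part of the tutorial's four lines. The sketch below uses plain NumPy so it stays self-contained; the class name RunningBackground and the alpha value are assumptions made for illustration. The update step is the same blend that cv2.accumulateWeighted performs.

```python
import numpy as np

class RunningBackground:
    """Background model that slowly follows the scene (exponential running average)."""

    def __init__(self, alpha=0.1, threshold=25):
        self.alpha = alpha          # hypothetical learning rate: higher adapts faster
        self.threshold = threshold  # same difference threshold as in the snippet above
        self.background = None

    def apply(self, gray):
        """Return a boolean motion mask for one grayscale frame, then update the model."""
        frame = np.asarray(gray, dtype=np.float32)
        if self.background is None:
            self.background = frame.copy()
        mask = np.abs(frame - self.background) > self.threshold
        # Blend the new frame into the background estimate.
        self.background = (1.0 - self.alpha) * self.background + self.alpha * frame
        return mask

# Simulated footage: the whole scene brightens by one grey level per frame.
model = RunningBackground()
false_alarms = 0
for t in range(60):
    frame = np.full((20, 20), 100 + t, dtype=np.uint8)
    false_alarms += int(model.apply(frame).sum())

# Now an object that is much brighter than the scene appears.
frame = np.full((20, 20), 160, dtype=np.uint8)
frame[5:10, 5:10] = 240
object_mask = model.apply(frame)
print(false_alarms, int(object_mask.sum()))
```

The trade-off is that an object which stops moving is gradually absorbed into the background, so alpha has to be chosen for the scene.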

Related

GPUImage Histogram Equalization

I would like to use GPUImage's Histogram Equalization filter (link to .h) (link to .m) for a camera app. I'd like to use it in real time and present it as an option to be applied on the live camera feed. I understand this may be an expensive operation and cause some latency.
I'm confused about how this filter works. When selected in GPUImage's example project (Filter Showcase) the filter shows a very dark image that is biased toward red and blue which does not seem to be the way equalization should work.
Also what is the difference between the histogram types kGPUImageHistogramLuminance and kGPUImageHistogramRGB? Filter Showcase uses kGPUImageHistogramLuminance but the default in the init is kGPUImageHistogramRGB. If I switch Filter Showcase to kGPUImageHistogramRGB, I just get a black screen. My goal is an overall contrast optimization.
Does anyone have experience using this filter? Or are there current limitations with this filter that are documented somewhere?
Histogram equalization of RGB images is done on the luminance, because equalizing the RGB channels separately would render the colour information useless.
You basically convert RGB to a colour space that separates colour from intensity information, then equalize the intensity image, and finally convert it back to RGB.
According to the documentation: http://oss.io/p/BradLarson/GPUImage
GPUImageHistogramFilter: This analyzes the incoming image and creates
an output histogram with the frequency at which each color value
occurs. The output of this filter is a 3-pixel-high, 256-pixel-wide
image with the center (vertical) pixels containing pixels that
correspond to the frequency at which various color values occurred.
Each color value occupies one of the 256 width positions, from 0 on
the left to 255 on the right. This histogram can be generated for
individual color channels (kGPUImageHistogramRed,
kGPUImageHistogramGreen, kGPUImageHistogramBlue), the luminance of the
image (kGPUImageHistogramLuminance), or for all three color channels
at once (kGPUImageHistogramRGB).
I'm not very familiar with the programming language used so I can't tell if the implementation is correct. But in the end, colours should not change too much. Pixels should just become brighter or darker.
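For reference, the convert / equalize / reconvert idea described above can be sketched in a few lines of Python with NumPy. This is not GPUImage's implementation, only an illustration of the principle. It assumes full-range BT.601 YCbCr (the JPEG convention) as the colour space that separates intensity from colour, and the function names are made up for the example.

```python
import numpy as np

def equalize_channel(channel):
    """Classic histogram equalization of one uint8 channel via its cumulative histogram."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    span = max(int(cdf[-1] - cdf_min), 1)
    lut = np.clip(np.round((cdf - cdf_min) * 255.0 / span), 0, 255).astype(np.uint8)
    return lut[channel]

def equalize_luminance(rgb):
    """Equalize only the luma (Y) of an RGB uint8 image and leave Cb/Cr untouched."""
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # RGB -> YCbCr (full-range BT.601, assumed here)
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    y_eq = equalize_channel(np.clip(np.round(y), 0, 255).astype(np.uint8)).astype(np.float64)
    # YCbCr -> RGB with the equalized luma
    out = np.stack([
        y_eq + 1.402 * (cr - 128.0),
        y_eq - 0.344136 * (cb - 128.0) - 0.714136 * (cr - 128.0),
        y_eq + 1.772 * (cb - 128.0),
    ], axis=-1)
    return np.clip(np.round(out), 0, 255).astype(np.uint8)

# Low-contrast grey test image: values 100..149 only.
ramp = np.tile(np.arange(100, 150, dtype=np.uint8), (10, 1))
low_contrast = np.stack([ramp, ramp, ramp], axis=-1)
result = equalize_luminance(low_contrast)
print(result.min(), result.max())
```

On this grey test image the output stays grey and simply spans the full range, which matches the expectation that pixels should only become brighter or darker.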

Should I use HSV/HSB or RGB and why?

I have to detect leukocyte cells in an image that also contains other blood cells. The cells can be distinguished by their color: leukocytes have a denser purple color, as can be seen in the image below.
Which color space should I use, RGB or HSV, and why?
sample image:
Usually when making decisions like this I just quickly plot the different channels and color spaces and see what I find. It is always better to start with a high-quality image than to start with a poor one and try to fix it with lots of processing.
In this specific case I would use HSV. But unlike most color segmentation, I would actually use the saturation channel to segment the image. The cells are nearly the same hue, so using the hue channel would be very difficult.
Hue (shown at full saturation and full brightness): very hard to differentiate the cells
Saturation: huge contrast
Green channel: actually shows a lot of contrast as well (it surprised me)
Red and blue channels: hard to actually distinguish the cells
Now that we have two candidate representations, the saturation channel or the green channel, we ask which is easier to work with. Since any HSV work requires converting the RGB image first, we can dismiss it, so the clear choice is to simply use the green channel of the RGB image for segmentation.
Edit
Since you didn't include a language tag, I would like to attach some Matlab code I just wrote. It displays an image in 4 color spaces so you can quickly make an informed decision on which to use. It mimics the colorspace selection window of Matlab's Color Thresholder.
function ViewColorSpaces(rgb_image)
% ViewColorSpaces(rgb_image)
% displays an RGB image in 4 different color spaces. RGB, HSV, YCbCr,CIELab
% each of the 3 channels are shown for each colorspace
% the display mimics the new Matlab Color Thresholder window
% http://www.mathworks.com/help/images/image-segmentation-using-the-color-thesholder-app.html
hsvim = rgb2hsv(rgb_image);
yuvim = rgb2ycbcr(rgb_image);
%cielab colorspace
cform = makecform('srgb2lab');
cieim = applycform(rgb_image,cform);
figure();
%rgb
subplot(3,4,1);imshow(rgb_image(:,:,1));title(sprintf('RGB Space\n\nred'))
subplot(3,4,5);imshow(rgb_image(:,:,2));title('green')
subplot(3,4,9);imshow(rgb_image(:,:,3));title('blue')
%hsv
subplot(3,4,2);imshow(hsvim(:,:,1));title(sprintf('HSV Space\n\nhue'))
subplot(3,4,6);imshow(hsvim(:,:,2));title('saturation')
subplot(3,4,10);imshow(hsvim(:,:,3));title('brightness')
%ycbcr / yuv
subplot(3,4,3);imshow(yuvim(:,:,1));title(sprintf('YCbCr Space\n\nLuminance'))
subplot(3,4,7);imshow(yuvim(:,:,2));title('blue difference')
subplot(3,4,11);imshow(yuvim(:,:,3));title('red difference')
%CIElab
subplot(3,4,4);imshow(cieim(:,:,1));title(sprintf('CIELab Space\n\nLightness'))
subplot(3,4,8);imshow(cieim(:,:,2));title('green red')
subplot(3,4,12);imshow(cieim(:,:,3));title('yellow blue')
end
you could call it like this
rgbim = imread('http://i.stack.imgur.com/gd62B.jpg');
ViewColorSpaces(rgbim)
and the display is this
In DIP and CV this is always a valid question.
But it has no universal answer, because each task is unique, so use whatever is better suited to it. To choose correctly you need to know the pros and cons of each, so here is a summary:
RGB
This is easy to handle, and you can easily access the r, g, b bands. In many cases it is better to check just a single band instead of the whole color, or to mix the bands to emphasize a wanted feature or even dampen an unwanted one. It is hard to compare colors in RGB because intensity is encoded directly into the bands. To remedy that you can use normalization, but that is slow (it needs a per-pixel sqrt). You can do arithmetic on RGB colors directly.
Example of a task better suited for RGB:
finding the horizon in a high-altitude photo
HSV
is better suited for color recognition, because HSV corresponds much more closely to human color perception, so if you want to recognize areas of distinct colors, HSV is better. The conversion between RGB and HSV takes a bit of time, which can be a problem for high-resolution or high-fps apps. For standard DIP/CV tasks this is usually not the case.
Example of a task better suited for HSV:
Compare RGB colors
Take a look at:
HSV histogram
to see the distinct color separation in HSV. Segmenting an image based on color is easy in HSV. You cannot do arithmetic on HSV colors directly; instead you need to convert to RGB and back.

How to choose appropriate Scalar values when using InRange in OpenCV

I am trying to detect the white shapes in an object and can successfully do it for 1 video.
// Create and display a new matrix for triangles
triangles = src.clone();
GaussianBlur(triangles, triangles, Size(5, 5), 0, 0);
inRange(triangles, Scalar(150,150,150), Scalar(255, 255, 255), triangles);
imshow("triangles", triangles);
This gives me the result
http://s8.postimg.org/o9xg284jp/triangles.png
However, if I use a different video - then the scalar value of 150 may not be appropriate (for example if it is a light environment... everything gets detected)
http://s8.postimg.org/m09brgvlx/bad_triangles.png
For this video I would need to change the minimum scalar to around 190-200 for it to work properly. My question: is there a good way to determine the correct scalar value to use? I know it sounds simple to some, but I've got a headache because of it!
http://colorizer.org/
If you check here you can see what your problem is. RGB = (255, 155, 155) is probably not "white", but your inRange call returns true for it.
Try the HSL color space instead. Lightness > 90% is white for sure, no matter what the H and S channel values are. Use the BGR2HLS conversion, then use inRange with the L channel between 90% and 100%. (For 8-bit images OpenCV scales L to 0-255 and stores the channels in H, L, S order, so that range is roughly 230-255 on the second channel.)
Actually, for color detection problems the most commonly used color spaces are HSV and HSL, not RGB!
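A quick way to see the difference without OpenCV is the standard-library colorsys module. The sketch below checks single pixels on a 0-1 lightness scale; the function names and the exact 0.9 cut-off are just illustrative choices based on the numbers above.

```python
import colorsys

def passes_rgb_range(rgb, low=150, high=255):
    """Mimics inRange(Scalar(150,150,150), Scalar(255,255,255)) for a single pixel."""
    return all(low <= c <= high for c in rgb)

def is_white_hls(rgb, min_lightness=0.9):
    """True if the HLS lightness of the pixel is at least min_lightness (0-1 scale)."""
    r, g, b = (c / 255.0 for c in rgb)
    _, lightness, _ = colorsys.rgb_to_hls(r, g, b)
    return lightness >= min_lightness

pink = (255, 155, 155)
near_white = (240, 240, 240)

print(passes_rgb_range(pink), is_white_hls(pink))
print(passes_rgb_range(near_white), is_white_hls(near_white))
```

The pinkish pixel passes the RGB box test but is rejected by the lightness test, while the near-white pixel passes both.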
There is probably no way to automatically determine a threshold that works for all kinds of videos. But to make it less dependent on the overall lighting of the video, you could make it depend on the mean or median pixel value of the image.
Or, if you know how big your object appears in the image, you could choose the threshold accordingly.
Another approach could be to normalize the brightness of the video.
But which approach is best depends strongly on your exact situation and requirements.
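As a rough sketch of the first suggestion (a threshold tied to the median), something like the following could work on a grayscale frame. The offset of 60 grey levels is a made-up tuning value, and the function names are hypothetical; the right relation between median and threshold would have to be found for the actual footage.

```python
import numpy as np

def relative_threshold(gray, offset=60):
    """Lower bound for 'white': median brightness plus a fixed offset, capped at 255."""
    return min(255.0, float(np.median(gray)) + offset)

def white_mask(gray, offset=60):
    """Boolean mask of pixels at or above the frame-relative threshold."""
    return np.asarray(gray) >= relative_threshold(gray, offset)

# Two synthetic frames: the same white shape in a dark and in a bright environment.
dark = np.full((50, 50), 80, dtype=np.uint8)
dark[10:20, 10:20] = 200
bright = np.full((50, 50), 160, dtype=np.uint8)
bright[10:20, 10:20] = 250

print(relative_threshold(dark), int(white_mask(dark).sum()))
print(relative_threshold(bright), int(white_mask(bright).sum()))
print(int((bright >= 150).sum()))  # a fixed threshold of 150 selects the whole bright frame
```

This only helps when the shapes cover a minority of the frame, so that the median reflects the background rather than the shapes.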

Fast image thresholding

What is a fast and reliable way to threshold images with possible blurring and non-uniform brightness?
Example (blurring but uniform brightness):
Because the image is not guaranteed to have uniform brightness, it's not feasible to use a fixed threshold. An adaptive threshold works alright, but because of the blurriness it creates breaks and distortions in the features (here, the important features are the Sudoku digits):
I've also tried using Histogram Equalization (using OpenCV's equalizeHist function). It increases contrast without reducing differences in brightness.
The best solution I've found is to divide the image by its morphological closing (credit to this post) to make the brightness uniform, then renormalize, then use a fixed threshold (using Otsu's algorithm to pick the optimal threshold level):
Here is code for this in OpenCV for Android:
Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(19,19));
Mat closed = new Mat(); // closed will have type CV_32F
Imgproc.morphologyEx(image, closed, Imgproc.MORPH_CLOSE, kernel);
Core.divide(image, closed, closed, 1, CvType.CV_32F);
Core.normalize(closed, image, 0, 255, Core.NORM_MINMAX, CvType.CV_8U);
Imgproc.threshold(image, image, -1, 255, Imgproc.THRESH_BINARY_INV
+Imgproc.THRESH_OTSU);
This works great but the closing operation is very slow. Reducing the size of the structuring element increases speed but reduces accuracy.
Edit: based on DCS's suggestion I tried using a high-pass filter. I chose the Laplacian filter, but I would expect similar results with Sobel and Scharr filters. The filter picks up high-frequency noise in the areas which do not contain features, and suffers from similar distortion to the adaptive threshold due to blurring. It also takes about as long as the closing operation. Here is an example with a 15x15 filter:
Edit 2: Based on AruniRC's answer, I used Canny edge detection on the image with the suggested parameters:
double mean = Core.mean(image).val[0];
Imgproc.Canny(image, image, 0.66*mean, 1.33*mean);
I'm not sure how to reliably automatically fine-tune the parameters to get connected digits.
Using Vaughn Cato and Theraot's suggestions, I scaled down the image before closing it, then scaled the closed image up to regular size. I also reduced the kernel size proportionately.
Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(5,5));
Mat temp = new Mat();
Imgproc.resize(image, temp, new Size(image.cols()/4, image.rows()/4));
Imgproc.morphologyEx(temp, temp, Imgproc.MORPH_CLOSE, kernel);
Imgproc.resize(temp, temp, new Size(image.cols(), image.rows()));
Core.divide(image, temp, temp, 1, CvType.CV_32F); // temp will now have type CV_32F
Core.normalize(temp, image, 0, 255, Core.NORM_MINMAX, CvType.CV_8U);
Imgproc.threshold(image, image, -1, 255,
Imgproc.THRESH_BINARY_INV+Imgproc.THRESH_OTSU);
The image below shows the results side-by-side for 3 different methods:
Left - regular size closing (432 pixels), size 19 kernel
Middle - half-size closing (216 pixels), size 9 kernel
Right - quarter-size closing (108 pixels), size 5 kernel
The image quality deteriorates as the size of the image used for closing gets smaller, but the deterioration isn't significant enough to affect feature recognition algorithms. The speed increases slightly more than 16-fold for the quarter-size closing, even with the resizing, which suggests that closing time is roughly proportional to the number of pixels in the image.
Any suggestions on how to further improve upon this idea (either by further increasing the speed, or by reducing the deterioration in image quality) are very welcome.
Alternative approach:
Assuming your intention is to have the numerals to be clearly binarized ... shift your focus to components instead of the whole image.
Here's a pretty easy approach:
Do a Canny edge map on the image. First try it with the low threshold set to 0.66*[mean value] and the high threshold set to 1.33*[mean value] (meaning the mean of the greylevel values).
You will need to fiddle with the parameters a bit to get an image where the major components/numerals are clearly visible as separate components. Near perfect is good enough at this stage.
Treating each Canny edge as a connected component (i.e. using cvFindContours() or its C++ counterpart, whichever), one can estimate the foreground and background greylevels and arrive at a threshold.
For the last bit, do take a look at sections 2. and 3. of this paper. Skipping most of the non-essential theoretical parts it shouldn't be too difficult to have it implemented in OpenCV.
Hope this helped!
Edit 1:
Based on the Canny edge thresholds here's a very rough idea just sufficient to fine-tune the values. The high_threshold controls how strong an edge must be before it is detected. Basically, an edge must have gradient magnitude greater than high_threshold to be detected in the first place. So this does the initial detection of edges.
Now, the low_threshold deals with connecting nearby edges. It controls how much nearby disconnected edges will get combined together into a single edge. For a better idea, read "Step 6" of this webpage. Try setting a very small low_threshold and see how things come about. You could discard that 0.66*[mean value] thing if it doesn't work on these images; it's just a rule of thumb anyway.
We use Bradley's algorithm for a very similar problem (segmenting letters from the background, with uneven light and uneven background color). It is described here: http://people.scs.carleton.ca:8008/~roth/iit-publications-iti/docs/gerh-50002.pdf, and C# code is here: http://code.google.com/p/aforge/source/browse/trunk/Sources/Imaging/Filters/Adaptive+Binarization/BradleyLocalThresholding.cs?r=1360. It works on the integral image, which can be calculated with OpenCV's integral function. It is very reliable and fast; it is not itself implemented in OpenCV, but it is easy to port.
Another option is the adaptiveThreshold method in OpenCV, but we did not give it a try: http://docs.opencv.org/modules/imgproc/doc/miscellaneous_transformations.html#adaptivethreshold. The MEAN version is the same as Bradley's, except that it uses a constant to modify the mean value instead of a percentage, and I think the percentage is better.
Also, good article is here: https://dsp.stackexchange.com/a/2504
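To give an idea of how small a port of Bradley's method is, here is a sketch in Python with NumPy. It follows the usual description of the algorithm (a pixel becomes black if it is more than t percent darker than the mean of its surrounding window, with the window sums read from an integral image). The defaults of one eighth of the image width and t = 15 are the commonly quoted values and should be treated as assumptions; this is not a line-by-line port of the linked C# code.

```python
import numpy as np

def bradley_threshold(gray, window=None, t=15):
    """Binarize a grayscale image: 0 where a pixel is t% below its local mean, else 255."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    if window is None:
        window = max(w // 8, 1)
    half = window // 2

    # Integral image with a leading row and column of zeros.
    integral = np.zeros((h + 1, w + 1))
    integral[1:, 1:] = gray.cumsum(axis=0).cumsum(axis=1)

    # Window corners for every pixel, clipped at the image borders.
    ys, xs = np.mgrid[0:h, 0:w]
    y0 = np.clip(ys - half, 0, h)
    y1 = np.clip(ys + half + 1, 0, h)
    x0 = np.clip(xs - half, 0, w)
    x1 = np.clip(xs + half + 1, 0, w)

    count = (y1 - y0) * (x1 - x0)
    total = integral[y1, x1] - integral[y0, x1] - integral[y1, x0] + integral[y0, x0]

    # Compare pixel * count with the window sum to avoid a division per pixel.
    dark = gray * count <= total * (100 - t) / 100.0
    return np.where(dark, 0, 255).astype(np.uint8)

# Synthetic page: brightness ramps from 100 to 200, with two 4x4 "ink" blocks at half brightness.
page = np.tile(np.linspace(100, 200, 64), (32, 1))
page[14:18, 10:14] *= 0.5
page[14:18, 50:54] *= 0.5
binary = bradley_threshold(page)
print(int((binary == 0).sum()))
```

On this synthetic ramp only the two ink blocks come out black, regardless of which side of the brightness gradient they sit on.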
You could try working on a per-tile basis if you know you have a good crop of the grid. Working on 9 subimages rather than the whole pic will most likely lead to more uniform brightness on each subimage. If your cropping is perfect you could even try going for each digit cell individually; but it all depends on how reliable is your crop.
An elliptical structuring element is more expensive to compute with than a flat (rectangular) one.
Try changing:
Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(19,19));
to:
Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(19,19));
This can speed up your solution enough, with little impact on accuracy.

Estimate Brightness of an image Opencv

I have been trying to obtain the image brightness in OpenCV. So far I have used calcHist and taken the average of the histogram values. However, I feel this is not accurate, as it does not actually determine the brightness of an image. I performed calcHist on a grayscale version of the image and tried to differentiate between the average values obtained from bright images and those from moderate ones. I have not been successful so far. Could you please help me with a method or algorithm, realisable with OpenCV, to estimate the brightness of an image? Thanks in advance.
I suppose that the HSV color model will be useful for your problem, where channel V is Value:
"Value is the brightness of the color and varies with color saturation. It ranges from 0 to 100%. When the value is ’0′ the color space will be totally black. With the increase in the value, the color space brightness up and shows various colors."
So use the OpenCV function cvCvtColor(const CvArr* src, CvArr* dst, int code), which converts an image from one color space to another. In your case code = CV_BGR2HSV. Then calculate the histogram of the third channel, V.
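For a quick experiment without the conversion call: the V channel of HSV is simply the largest of the three colour components of each pixel, so its mean and histogram can be sketched directly with NumPy. The function names here are made up for the example, and values are on the 0-255 scale of an 8-bit image.

```python
import numpy as np

def value_channel(bgr):
    """HSV 'V' of an 8-bit colour image: the per-pixel maximum of the three channels."""
    return np.asarray(bgr).max(axis=2)

def mean_value_brightness(bgr):
    """Average of the V channel, 0 (black) to 255 (brightest)."""
    return float(value_channel(bgr).mean())

def value_histogram(bgr):
    """256-bin histogram of the V channel."""
    return np.bincount(value_channel(bgr).ravel(), minlength=256)

dark_img = np.zeros((4, 4, 3), dtype=np.uint8)
dark_img[:] = (10, 20, 30)
bright_img = np.zeros((4, 4, 3), dtype=np.uint8)
bright_img[:] = (200, 250, 100)

print(mean_value_brightness(dark_img), mean_value_brightness(bright_img))
```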
I was about to ask the same, but then found that a similar question had received no satisfactory answers. All the answers I've found on SO deal with human perception of a single pixel, RGB vs HSV.
From my observations, the subjective brightness of an image also depends strongly on the pattern. A star in a dark sky may look brighter than a cloudy sky by day, while the average pixel value of the first image will be much smaller.
The images I use are grey-scale cell images produced by a microscope. The forms vary considerably. Sometimes they are small bright dots on a very black background, sometimes less bright, bigger areas on a not-so-dark background.
My approach is:
Find the histogram maximum (HMax), using a threshold to remove hot pixels.
Calculate the mean value of all pixels between HMax * 2/3 and HMax.
The ratio 2/3 could also be increased to 3/4 (which reduces the range of pixels considered bright).
The approach works quite well, as different cell patterns with the same titration produce similar brightness.
P.S.: What I actually wanted to ask is whether there is a similar function for such a calculation in OpenCV or SimpleCV. Many thanks for any comments!
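The two steps above could be sketched in NumPy as follows. This is one reading of the description, not the poster's actual code: HMax is taken to be the highest grey level left after discarding hot pixels, and the hot pixels are discarded with a high percentile (99.9 here, an arbitrary choice) instead of an explicit histogram threshold.

```python
import numpy as np

def bright_region_mean(gray, ratio=2.0 / 3.0, hot_pixel_percentile=99.9):
    """Mean of the pixels between ratio * HMax and HMax, ignoring hot pixels above HMax."""
    gray = np.asarray(gray, dtype=np.float64)
    hmax = np.percentile(gray, hot_pixel_percentile)  # assumed stand-in for "HMax"
    mask = (gray >= hmax * ratio) & (gray <= hmax)
    return float(gray[mask].mean())

# Synthetic cell image: dark background, one bright 10x10 patch, one hot pixel.
cells = np.full((100, 100), 10, dtype=np.uint8)
cells[40:50, 40:50] = 200
cells[0, 0] = 255  # hot pixel

print(bright_region_mean(cells), float(cells.mean()))
```

On this test image the plain mean is dominated by the dark background, while the restricted mean reports the brightness of the patch and ignores the hot pixel.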
I prefer Valentin's answer, but for 'yet another' way of determining average-per-pixel brightness, you can use numpy and the Euclidean norm of each pixel's channels (in effect a root-mean-square rather than an arithmetic mean). To me it has better results.
import numpy as np
from numpy.linalg import norm

def brightness(img):
    if len(img.shape) == 3:
        # Colored RGB or BGR (*Do Not* use HSV images with this function)
        # create brightness with euclidean norm
        return np.average(norm(img, axis=2)) / np.sqrt(3)
    else:
        # Grayscale
        return np.average(img)
A bit of OpenCV C++ source code for a trivial check to differentiate between light and dark images. This is inspired by the answer above provided years ago by @ann-orlova:
const int darkness_threshold = 128; // you need to determine what threshold to use
cv::Mat mat = get_image_from_device();
cv::Mat hsv;
cv::cvtColor(mat, hsv, CV_BGR2HSV); // newer OpenCV versions spell this constant cv::COLOR_BGR2HSV
const auto result = cv::mean(hsv);
// cv::mean() will return 3 numbers, one for each channel:
// 0=hue
// 1=saturation
// 2=value (brightness)
if (result[2] < darkness_threshold)
{
    process_dark_image(mat);
}
else
{
    process_light_image(mat);
}

Resources