I have a background image of a plain surface.
My goal is to track objects that are placed on or moved over the surface.
I'm using MOG2 to find foreground objects with a learning rate of 0, so the background is not updated (otherwise a static object would be incorporated in the background).
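For reference, this is roughly how I drive MOG2 (Python bindings; the video source and window handling are just placeholders, not my exact code):

    import cv2

    cap = cv2.VideoCapture("input.mp4")                 # placeholder video source
    mog2 = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

    ok, background = cap.read()                         # learn the background once
    mog2.apply(background, learningRate=1.0)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # learningRate=0 freezes the model, so static objects stay in the foreground
        fg_mask = mog2.apply(frame, learningRate=0)
        cv2.imshow("foreground", fg_mask)
        if cv2.waitKey(1) == 27:                        # Esc to quit
            break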
The result is fine, but I have a huge problem with light: if the lighting changes after the background is acquired, various artifacts are detected as foreground objects.
How can I improve the robustness against lighting changes?
Update
I'm experimenting with a solution that works quite well, but it needs some fixes.
I'm using MOG2 in this manner:
Acquire and learn the background from the first frames (BGK)
Apply MOG2 to the current frame with a learning rate of 0 (no update) and get the foreground mask (FG_MASK)
For the next frames, use FG_MASK to mask the current frame with BGK (the foreground blobs are filled from BGK) and apply the result to MOG2 with some learning rate (this updates the background)
After that, update BGK by taking it back from the MOG2 algorithm
In this way, objects are masked out of the background while the background keeps updating, which guarantees good robustness against light changes.
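A rough sketch of the loop (Python; the names and the 0.01 learning rate are placeholders, not my exact code):

    import cv2

    cap = cv2.VideoCapture("input.mp4")                 # placeholder source
    ok, bgk = cap.read()                                # step 1: first clean frame is BGK

    mog2 = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
    mog2.apply(bgk, learningRate=1.0)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        fg_mask = mog2.apply(frame, learningRate=0)     # step 2: detect, no update

        # step 3: fill the detected blobs with the stored background, so the objects
        # themselves never enter the model, then update with a small learning rate
        masked = frame.copy()
        masked[fg_mask > 0] = bgk[fg_mask > 0]
        mog2.apply(masked, learningRate=0.01)

        bgk = mog2.getBackgroundImage()                 # step 4: refresh BGK from MOG2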
There are some drawbacks: for example, when the light changes, the object mask ("mask blob") keeps the previous brightness, and if the difference is too high it can be detected as a new object.
In the above image you can see that the current frame is brighter and the mask for the static object is darker.
My idea is to adapt the "mask blob" by changing its brightness to follow the lighting changes. How can I achieve this with OpenCV?
Fix for previous drawbacks
Using the inpaint function instead of simply masking with BGK (step 3), I can keep the "mask blobs" in sync with the background brightness changes.
This fix has a drawback too: it is not very fast.
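The inpaint variant of step 3 looks roughly like this (same names as in the sketch above):

    # inpaint the blob area of the current frame instead of pasting the old BGK pixels,
    # so the filled area follows the *current* surrounding brightness
    masked = cv2.inpaint(frame, fg_mask, 3, cv2.INPAINT_TELEA)
    mog2.apply(masked, learningRate=0.01)
    bgk = mog2.getBackgroundImage()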
Update 2
I think this is an interesting topic so I keep it updated.
The inpaint function is very slow, so I'm trying another way.
Using the HSV color space lets me work on the brightness channel, and I can reduce the impact of brightness changes in this way:
obtain the V channel with the split function
calculate the mean value of the V channel
apply a truncate threshold to the V channel using the mean value
rebuild the frame using the new V channel
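Roughly, as a Python sketch:

    import cv2

    def flatten_brightness(frame_bgr):
        hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
        h, s, v = cv2.split(hsv)
        mean_v = int(cv2.mean(v)[0])
        # THRESH_TRUNC clamps every value above the mean down to the mean
        _, v_trunc = cv2.threshold(v, mean_v, 255, cv2.THRESH_TRUNC)
        return cv2.cvtColor(cv2.merge([h, s, v_trunc]), cv2.COLOR_HSV2BGR)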
I had a similar problem implementing a speed estimation algorithm; I hope my solution may help you.
One of the methods I tried was the Accumulative Difference Image (basically what you did with MOG2), but it failed to track stationary objects when the background updated. When I did not update the background, I had the same problem as you.
So I decided to use RGB/HSV thresholding. I set the boundaries for the color of the road (let us say gray) and created a binary image where everything the color of the road was black (0) and everything else was white (1). Here is a nice tutorial on HSV thresholding. When choosing boundaries you can account for the lighting factor by setting, let us say, the upper boundary for bright lighting and the lower one for dark. However, this method may cause an object of a color similar to the background to be missed by the algorithm. Another shortcoming is that the background should be uniform, without any details.
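In OpenCV that could look something like this (the HSV boundaries below are made-up values for a grayish road; you would tune them for your own footage):

    import cv2
    import numpy as np

    def road_mask(frame_bgr):
        hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
        # low saturation, mid value ~ "gray asphalt"; the wide V range tolerates lighting
        lower = np.array([0,   0,  60])
        upper = np.array([179, 60, 220])
        road = cv2.inRange(hsv, lower, upper)           # road-colored pixels -> 255
        return cv2.bitwise_not(road)                    # road black (0), everything else white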
Another method you can try is to convert both the input image and the background to grayscale and then subtract them manually. This gives you the opportunity to tweak the threshold level for the difference from the background. Let us say a background pixel with a value of 120 in dark conditions has 140 in bright conditions, so the difference is 20. An object pixel may have, let us say, a value of 180 against a background value of 120, so the difference is 60. Set the threshold for the difference to 20, set values below 20 to 0 and values above 20 to 1, and this should do the trick (all values are on a scale from 0 to 255).
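In code, that idea is just a grayscale absdiff plus a global threshold (20 as in the example above):

    import cv2

    def foreground_mask(frame_bgr, background_bgr, thresh=20):
        gray_f = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        gray_b = cv2.cvtColor(background_bgr, cv2.COLOR_BGR2GRAY)
        diff = cv2.absdiff(gray_f, gray_b)
        # below `thresh` -> 0 (background), above -> 255 (object)
        _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
        return mask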
Good luck!
Related
I've got a video stream from the camera. My goal is to detect and track the position of a moving object (a train).
First of all I tried movement detection (frame differencing, background subtractors), but it gave bad results.
I tried to cling to the color of the object, but it is often (bad lighting, blur) the same color as the ground (the railway).
So the current approach is to divide the area of movement of the object into n small regions and compute the difference between the stored region (taken when there is no object) and the current one.
The main problem here is that the lighting changes: when I use a stored region from a reference frame (where there is no object), the brightness of the current frame might be different, and that breaks the comparison.
The brightness can also change while the object is moving.
Applying a GaussianBlur and histogram equalization helps make the sensitivity to brightness changes a bit lower.
I tried to compare the structure of corresponding regions using SSIM, LBPH, and HOG. When I tested the LBPH and HOG approaches on manually cropped regions larger than the real ones they seemed to work, but when I applied them to my small regions they stopped working.
At the moment the most effective approach is just the difference between grayscale regions with RMSE and a fixed threshold, but it is not a robust approach and it suffers a lot when the brightness changes.
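The RMSE comparison I use is essentially this (a sketch, not my exact code; the regions are same-size grayscale crops and the threshold is tuned by hand):

    import numpy as np

    def region_rmse(stored_region, current_region):
        # both arguments are same-size grayscale crops (uint8)
        diff = stored_region.astype(np.float32) - current_region.astype(np.float32)
        return float(np.sqrt(np.mean(diff ** 2)))

    # a region counts as "occupied" if it differs too much from the empty reference:
    # occupied = region_rmse(reference_crop, current_crop) > FIXED_THRESHOLD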
Now I'm trying to use a high-pass operator to extract the most dominant edges, like the Sobel operator in the attached figure, but I'm not sure how to properly compare the high-passed regions other than by taking their difference.
Frame with an empty railway:
A few seconds later a train appears and the luminance has changed.
At night the luminance is also different.
So the questions are:
What approaches are there for comparing high-passed images?
Is there any other way you could suggest to determine whether an area is occupied?
I'm trying to reproduce an Adobe Lightroom effect in my iOS application. Currently I'm using GPUImage for all effects, but I found it difficult to reproduce the Highlights and Shadows effects. I also tried CIHighlightShadowAdjust from CIFilter, but it gives me the wrong result.
So I'm looking for at least the algorithm Lightroom uses for these effects. They are both very similar to a brightness change, but it seems they change only the light or dark parts of the picture, depending on whether Highlights or Shadows is used.
Can anyone point me in the right direction for what I need to look at to make the same effects? How is it possible to change the brightness of only the dark/light parts of a picture?
Here are examples:
1. Left is the original image and right is the image with a +100 highlights adjustment (possible range -100 to 100, with 0 as the default).
You can see that the sky (the lighter part of the image) has a different brightness, but the statue has barely changed.
2. Left is the original image and right is the image with a +100 shadows adjustment (possible range -100 to 100, with 0 as the default).
Here you can see that the statue (the darker part of the picture) has a big change in brightness, but the sky remains almost unchanged.
It looks like a nonlinear brightness transform has been applied. For example, the Highlights effect could mean that only the brighter parts of the image get their brightness increased further, and the Shadows adjustment could mean that only the darker parts of the image get increased brightness.
The general approach would be
Transform RGB image data into a color space with the brightness as separate dimension, for example HSL or CIELAB.
Transform the brightness/lightness/luminance pixel-wise with a single transformation function that is continuous and monotonically increasing but bounded to the range of allowed values. This is akin to non-linearly stretching or compressing the brightness histogram.
Replace the original brightness with the transformed one.
Transform back into RGB color space.
A characteristic of the brightness transformation function is that it typically only stretches or compresses a certain brightness range (you show that nicely in the example images). This typically requires more than a single parameter (you would need to define the range of the histogram that is affected as well as the strength). It looks like Adobe has some heuristic for what it regards as shadows and what it regards as highlights (maybe the mean of the brightness histogram as the cut-off) and only exposes the strength as a parameter.
The exact shape of the transformation is also up to your own taste. I played around a bit:
Highlighting that looks similar to your highlighting I can get with a piecewise-linear function (transformed to CIELAB; lightness L goes from 0 to 100):
a = 1.5
b = 50
L(L>b)=a*L(L>b)-(a-1)*b
Shadow enhancement that looks similar to yours I can get with an exponentially decaying boost:
a = 4;
b = 20;
L = ((a-1)*exp(-L/b)+1) * L;
You see that I always need at least two parameters, and I'm convinced that one could find better transformation functions, but the results strongly suggest that in essence it's nothing more than a brightness transformation, which can be reproduced in an iOS app. Playing around with different transformation functions should give a better feeling for what looks good and what does not.
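For completeness, here is a Python/OpenCV port of those two transforms (my own sketch of the formulas above; OpenCV stores L* scaled to 0..255, so it is rescaled to 0..100 before applying them):

    import cv2
    import numpy as np

    def adjust_highlights(bgr, a=1.5, b=50.0):
        lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2Lab).astype(np.float32)
        L = lab[:, :, 0] * (100.0 / 255.0)
        L = np.where(L > b, a * L - (a - 1) * b, L)     # piecewise-linear stretch above b
        lab[:, :, 0] = np.clip(L, 0, 100) * (255.0 / 100.0)
        return cv2.cvtColor(lab.astype(np.uint8), cv2.COLOR_Lab2BGR)

    def adjust_shadows(bgr, a=4.0, b=20.0):
        lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2Lab).astype(np.float32)
        L = lab[:, :, 0] * (100.0 / 255.0)
        L = ((a - 1) * np.exp(-L / b) + 1) * L          # exponentially decaying boost
        lab[:, :, 0] = np.clip(L, 0, 100) * (255.0 / 100.0)
        return cv2.cvtColor(lab.astype(np.uint8), cv2.COLOR_Lab2BGR)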
I am trying to subtract two images using the absdiff function to extract a moving object. It works well, but sometimes the background appears in front of the foreground.
This actually happens when the background and foreground colors are similar. Is there any solution to overcome this problem?
The description of the problem above may not be enough, so I attach images at the following link.
Thanks..
You can use some pre-processing techniques like edge detection and a contrast stretching algorithm, which will give you extra information for subtracting the images. Since the colors are similar, the new object should still have texture features such as edges; if the edges are preserved properly, then when performing the image subtraction you will obtain the object.
Process flow:
Use an edge detection algorithm.
Apply a contrast stretching algorithm (like histogram stretching).
Overlay the detected edges on top of the contrast-stretched image.
Now use the image subtraction algorithm from OpenCV.
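A rough sketch of that flow (the Canny thresholds, the blend of edges and image, and the final threshold are all guesses to tune):

    import cv2

    def preprocess(gray):
        # contrast stretching to the full 0..255 range, then put the edges on top
        stretched = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX)
        edges = cv2.Canny(stretched, 50, 150)
        return cv2.max(stretched, edges)

    def extract_object(frame_bgr, background_bgr):
        f = preprocess(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY))
        b = preprocess(cv2.cvtColor(background_bgr, cv2.COLOR_BGR2GRAY))
        diff = cv2.absdiff(f, b)
        _, mask = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)
        return mask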
There isn't enough information to formulate a complete solution to your problem, but there are some tips I can offer:
First, prefilter the input and background images using a strong median (or Gaussian) filter. This will make your results much more robust to image noise and confusion from minor, non-essential detail (like the horizontal lines of your background image). Unless you want to detect a single moving strand of hair, you don't need to process the raw pixels.
Next, take the advice offered in the comments and test all 3 color channels as opposed to going straight to grayscale.
Then create a grayscale image from the max of the 3 absdiffs done on each channel.
Then perform your closing and opening procedure.
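Putting those tips together, a sketch could look like this (the filter sizes and the threshold are guesses):

    import cv2
    import numpy as np

    def moving_mask(frame_bgr, background_bgr):
        # 1. strong median prefilter on both images
        f = cv2.medianBlur(frame_bgr, 7)
        b = cv2.medianBlur(background_bgr, 7)

        # 2./3. per-channel absdiff, then the per-pixel max over the 3 channels
        diff = cv2.absdiff(f, b)
        diff_max = np.max(diff, axis=2).astype(np.uint8)
        _, mask = cv2.threshold(diff_max, 25, 255, cv2.THRESH_BINARY)

        # 4. closing then opening to clean up the mask
        kernel = np.ones((5, 5), np.uint8)
        mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
        return mask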
I don't know your requirements so I can't take them into account. If accuracy is of the utmost importance, I'd use the median filter on the input image over the Gaussian. If speed is an issue, I'd scale down the input images for processing by at least half, then scale the result up again. If the camera is in a fixed position and you have a pre-calibrated background, then the current naive difference method should work. If the system has to determine movement in a real-world environment over an extended period of time (moving shadows, plants, vehicles, weather, etc.), then a rolling average (or Gaussian) background model will work better. If the camera is moving you will need to do a lot more processing, probably some optical flow and/or Fourier transform tests. All of these things need to be considered to provide the best solution for the application.
There are two images:
alt text http://bbs.shoucangshidai.com/attachments/month_1001/1001211535bd7a644e95187acd.jpg
alt text http://bbs.shoucangshidai.com/attachments/month_1001/10012115357cfe13c148d3d8da.jpg
One is a background image and the other is a person's photo with the same background and the same size. What I want to do is remove the second image's background and extract only the person's silhouette. The common method is to subtract the first image from the second one, but my problem is that if the color of the person's clothes is similar to the background, the result of the subtraction is awful and I cannot get the whole silhouette. Does anyone have a good idea for removing the background? Please give me some advice.
Thank you in advance.
If you have a good estimate of the image background, subtracting it from the image with the person is a good first step. But it is only the first step. After that, you have to segment the image, i.e. you have to partition the image into "background" and "foreground" pixels, with constraints like these:
in the foreground areas, the average difference from the background image should be high
in the background areas, the average difference from the background image should be low
the areas should be smooth. Outline length and curvature should be minimal.
the borders of the areas should have a high contrast in the source image
If you are mathematically inclined, these constraints can be modeled perfectly with the Mumford-Shah functional. See here for more information.
But you can probably adapt other segmentation algorithms to the problem.
If you want a fast and simple (but not perfect) version, you could try this (a sketch of the first steps follows the list):
subtract the two images
find the largest connected "blob" of pixels with a background-foreground difference greater than some threshold. This is the first rough estimate of the "person area" in the foreground image, but the segmentation does not yet meet criteria 3 and 4 above.
Find the outline of the largest blob (EDIT: Note that you don't have to start at the outline. You can also start with a larger polygon, as the steps will automatically shrink it to the optimal position.)
now go through each point in the outline and smooth the outline. i.e. for each point find the point that minimizes the formula: c1*L - c2*G, where L is the length of the outline polygon if the point were moved here and G is the gradient at the location the point would be moved to, c1/c2 are constants to control the process. Move the point to that position. This has the effect of smoothing the contour polygon in areas of low gradient in the source image, while keeping it tied to high gradients in the source image (i.e. the visible borders of the person). You can try different expressions for L and G, for example, L could take the length and curvature into account, and G could also take the gradient in the background and subtracted images into account.
you probably will have to re-normalize the outline polygon, i.e. make sure that the points on the outline are spaced regularly. Either that, or make sure that the distances between the points stay regular in the step before. ("Geodesic Snakes")
repeat the last two steps until convergence
You now have an outline polygon that touches the visible person-background border and continues smoothly where the border is not visible or has low contrast.
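A minimal sketch of steps 1-3 (OpenCV 4.x; the threshold is a placeholder), which gives you the initial outline to feed into the smoothing loop:

    import cv2

    def rough_person_outline(image_bgr, background_bgr, thresh=30):
        diff = cv2.absdiff(image_bgr, background_bgr)
        gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
        _, mask = cv2.threshold(gray, thresh, 255, cv2.THRESH_BINARY)
        # largest connected blob = rough "person area"
        # (findContours returns 2 values in OpenCV 4.x)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        return max(contours, key=cv2.contourArea)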
Look up "Snakes" (e.g. here) for more information.
Low-pass filter (blur) the images before you subtract them.
Then use that difference signal as a mask to select the pixels of interest.
A wide-enough filter will ignore the too-small (high-frequency) features that end up carving out "awful" regions inside your object of interest. It'll also reduce the highlighting of pixel-level noise and misalignment (the highest-frequency information).
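In OpenCV terms, roughly (the kernel size and threshold are placeholders; the wider the kernel, the stronger the low-pass):

    import cv2

    def interest_mask(image_a, image_b, thresh=25):
        blur_a = cv2.GaussianBlur(image_a, (21, 21), 0)
        blur_b = cv2.GaussianBlur(image_b, (21, 21), 0)
        diff = cv2.cvtColor(cv2.absdiff(blur_a, blur_b), cv2.COLOR_BGR2GRAY)
        _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
        # select the pixels of interest out of the original (unblurred) image
        return cv2.bitwise_and(image_a, image_a, mask=mask)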
In addition, if you have more than two frames, introducing some time hysteresis will let you form more stable regions of interest over time too.
One technique that I think is common is to use a mixture model. Grab a number of background frames and for each pixel build a mixture model for its color.
When you apply a frame with the person in it you will get some probability that the color is foreground or background, given the probability densities in the mixture model for each pixel.
After you have P(pixel is foreground) and P(pixel is background) you could just threshold the probability images.
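A much-simplified, single-Gaussian-per-pixel version of that idea (a real mixture model keeps several Gaussians per pixel, which is essentially what OpenCV's MOG/MOG2 subtractors do):

    import numpy as np

    def foreground_from_model(frame, background_frames, p_thresh=0.01):
        # background_frames: list of HxWx3 frames of the empty scene
        stack = np.stack(background_frames).astype(np.float32)
        mean = stack.mean(axis=0)
        std = stack.std(axis=0) + 1e-6
        # per-pixel, per-channel z-score against the background model
        z = np.abs(frame.astype(np.float32) - mean) / std
        p_background = np.exp(-0.5 * z ** 2).prod(axis=2)   # naive channel independence
        return p_background < p_thresh                       # True where "probably foreground"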
Another possibility is to use the probabilities as inputs in some more clever segmentation algorithm. One example is graph cuts which I have noticed works quite well.
However, if the person is wearing clothes that are visually indistinguishable from the background, obviously none of the methods described above would work. You'd either have to get another sensor (like IR or UV) or have a quite elaborate "person model" which could "add" the legs in the right position if it finds what it thinks is a torso and head.
Good luck with the project!
Background vs. foreground detection is very subjective. The application scenario defines what is background and what is foreground. However, in the application you describe, I guess you are implicitly saying that the person is the foreground.
Using the above assumption, what you seek is a person detection algorithm. A possible solution is:
Run a Haar feature detector + boosted cascade of weak classifiers (see the OpenCV wiki for details)
Compute inter-frame motion (differences)
If there is a positive face detection for a frame, cluster motion pixels around the face (kNN algorithm)
Voila... you should have a simple person detector.
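A very rough sketch of that pipeline (using OpenCV's bundled frontal-face cascade; the box grown around each face is a crude stand-in for the kNN clustering step):

    import cv2

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def person_motion(frame_bgr, prev_bgr, diff_thresh=25):
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        prev = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2GRAY)
        faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        _, motion = cv2.threshold(cv2.absdiff(gray, prev), diff_thresh, 255, cv2.THRESH_BINARY)
        regions = []
        for (x, y, w, h) in faces:
            # keep the motion pixels in a box grown downwards/outwards from the face
            regions.append(motion[max(0, y - h):y + 4 * h, max(0, x - w):x + 2 * w])
        return regions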
Post the photo on Craigslist and tell them that you'll pay $5 for someone to do it.
Guaranteed you'll get hits in minutes.
Instead of a straight subtraction, you could step through both images, pixel by pixel, and only "subtract" the pixels which are exactly the same. That of course won't account for minor variances in colors, though.
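Vectorised with NumPy, that is simply (no tolerance for small color variations, as noted):

    import numpy as np

    def keep_changed_pixels(image_a, image_b):
        result = image_a.copy()
        same = np.all(image_a == image_b, axis=2)   # pixels identical in both images
        result[same] = 0                            # "subtract" (zero out) only those
        return result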
I implemented some adaptive binarization methods. They use a small window, and at each pixel the threshold value is calculated. There are problems with these methods:
If we select a window size that is too small we get this effect (I think the reason is the small window size):
(source: piccy.info)
In the upper left corner there is the original image, and in the upper right corner the global threshold result. The bottom left shows an example of dividing the image into parts (but I am talking about analyzing a small neighborhood around each pixel, for example a window of size 10x10).
So you can see the result of such algorithms in the bottom right picture: we got a black area, but it should be white.
Does anybody know how to improve the algorithm to solve this problem?
There should be quite a lot of research going on in this area, but unfortunately I have no good links to give.
An idea, which might work but I have not tested, is to try to estimate the lighting variations and then remove that before thresholding (which is a better term than "binarization").
The problem is then moved from adaptive thresholding to finding a good lighting model.
If you know anything about the light sources then you could of course build a model from that.
Otherwise a quick hack that might work is to apply a really heavy low pass filter to your image (blur it) and then use that as your lighting model. Then create a difference image between the original and the blurred version, and threshold that.
EDIT: After quick testing, it appears that my "quick hack" is not really going to work at all. After thinking about it I am not very surprised either :)
I = someImage
Ib = blur(I, 'a lot!')
Idiff = I - Ib
It = threshold(Idiff, 'some global threshold')
EDIT 2
Got one other idea which could work depending on how your images are generated.
Try estimating the lighting model from the first few rows in the image:
Take the first N rows in the image
Create a mean row from the N collected rows. You now have one row as your background model.
For each row in the image subtract the background model row (the mean row).
Threshold the resulting image.
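In NumPy terms, the idea is roughly this (untested; N and the threshold are guesses):

    import numpy as np

    def row_model_threshold(gray, n_rows=10, thresh=30):
        img = gray.astype(np.float32)
        model_row = img[:n_rows].mean(axis=0)       # mean of the first N rows
        diff = np.abs(img - model_row)              # subtract the model row from every row
        return (diff > thresh).astype(np.uint8) * 255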
Unfortunately I am at home without any good tools to test this.
It looks like you're doing adaptive thresholding wrong. Your images look as if you divided your image into small blocks, calculated a threshold for each block and applied that threshold to the whole block. That would explain the "box" artifacts. Usually, adaptive thresholding means finding a threshold for each pixel separately, with a separate window centered around the pixel.
Another suggestion would be to build a global model for your lighting: in your sample image, I'm pretty sure you could fit a plane (in X/Y/brightness space) to the image using least squares, then separate the pixels into those brighter than that plane (foreground) and those darker (background). You can then fit separate planes to the background and foreground pixels, threshold using the mean between these planes again, and improve the segmentation iteratively. How well that would work in practice depends on how well your lighting can be modeled with a linear model.
If the actual objects you are trying to segment are "thinner" (you said something about barcodes in a comment), you could try a simple opening/closing operation to get a lighting model (i.e. close the image to remove the foreground pixels, then use [closed image + X] as the threshold).
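A sketch of that closing-based lighting model (the kernel size and offset X are things to tune; the kernel must be larger than the thin foreground structures):

    import cv2
    import numpy as np

    def close_and_threshold(gray, x_offset=-10):
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (31, 31))
        lighting = cv2.morphologyEx(gray, cv2.MORPH_CLOSE, kernel)   # dark foreground removed
        # per-pixel threshold = closed image + X
        thresh = lighting.astype(np.int16) + x_offset
        return (gray.astype(np.int16) < thresh).astype(np.uint8) * 255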
Or, you could try mean-shift filtering to get the foreground and background pixels to the same brightness. (Personally, I'd try that one first)
You have very non-uniform illumination and fairly large object (thus, no universal easy way to extract the background and correct the non-uniformity). This basically means you can not use global thresholding at all, you need adaptive thresholding.
You want to try Niblack binarization. Matlab code is available here
http://www.uio.no/studier/emner/matnat/ifi/INF3300/h06/undervisningsmateriale/week-36-2006-solution.pdf (page 4).
There are two parameters you'll have to tune by hand: the window size (N in the linked code) and the weight.
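If you would rather stay in Python, a hand-rolled Niblack looks roughly like this (per-pixel threshold T = m + k*s over the window; the window size and weight k are the two parameters to tune):

    import cv2
    import numpy as np

    def niblack(gray, window=25, k=-0.2):
        img = gray.astype(np.float32)
        mean = cv2.boxFilter(img, -1, (window, window))
        sqmean = cv2.boxFilter(img * img, -1, (window, window))
        std = np.sqrt(np.maximum(sqmean - mean * mean, 0))
        thresh = mean + k * std                     # Niblack threshold per pixel
        return (img > thresh).astype(np.uint8) * 255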
Try to apply a local adaptive threshold using this procedure:
convolve the image with a mean or median filter
subtract the original image from the convolved one
threshold the difference image
The local adaptive threshold method selects an individual threshold for each pixel.
I'm using this approach extensively and it works fine with images that have a non-uniform background.
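A sketch of that procedure in Python (the filter size and threshold are placeholders; cv2.adaptiveThreshold with ADAPTIVE_THRESH_MEAN_C does essentially the same thing in one call):

    import cv2

    def local_threshold(gray, ksize=25, offset=10):
        blurred = cv2.medianBlur(gray, ksize)            # or cv2.blur(gray, (ksize, ksize)) for a mean filter
        diff = cv2.subtract(blurred, gray)               # dark foreground becomes positive
        _, binary = cv2.threshold(diff, offset, 255, cv2.THRESH_BINARY)
        return binary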