OpenCV BackgroundSubtractor Thinks A Busy Foreground Is The Background

I am using background subtraction to identify and control items on a conveyor belt.
The work is very similar to the typical tracking cars on a highway example.
I start out with an empty conveyor belt so the computer knows the background in the beginning.
But after the conveyor has been full of packages for a while, the computer starts to treat the packages as the background, which forces me to restart the program from time to time.
Does anyone have a workaround for this?
Since the conveyor belt is black, I am wondering whether it makes sense to toss in a few blank frames from time to time, or whether there is some other solution.
Much Thanks
# After experimentation, these values seem to work best.
import cv2

# vid is assumed to be a cv2.VideoCapture opened earlier.
backSub = cv2.createBackgroundSubtractorMOG2(history=5000, varThreshold=1000, detectShadows=False)
while True:
    return_value, frame = vid.read()
    if frame is None:
        print('Video has ended or failed, try a different video format!')
        break
    ## [Gaussian blur helps to remove noise. These settings seem to remove reflections from rollers.]
    blur = cv2.GaussianBlur(frame, (0, 0), 5, 0, 0)
    ## [End of: Gaussian blur helps to remove noise.]
    ## [Remove the background and produce a greyscale mask image]
    fgMask = backSub.apply(blur)
    ## [End of: Remove the background and produce a greyscale mask image]

You say you're giving the background subtractor some pure background initially.
When you're done with that and in the "running" phase, call apply() specifically with learningRate = 0. That ensures the model won't be updated.
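For instance, a minimal sketch of that two-phase pattern, reusing the names from the code above (the number of warm-up frames is just an assumption):
# Phase 1: learn the empty belt while it is guaranteed to be background.
for _ in range(500):
    ret, frame = vid.read()
    if not ret:
        break
    backSub.apply(cv2.GaussianBlur(frame, (0, 0), 5))  # default (automatic) learning rate
# Phase 2: freeze the model so packages can never be absorbed into the background.
while True:
    ret, frame = vid.read()
    if not ret:
        break
    fgMask = backSub.apply(cv2.GaussianBlur(frame, (0, 0), 5), learningRate=0)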

Related

OpenCV background subtraction: How to precompute background model?

I am working on a tracking algorithm and one of the earliest steps it does is background subtraction. The algorithm gets a series of frames that represent the video with a moving object and static background. The object is in every frame.
In my first version of this process I computed a median image from all the frames and got a very good background scene approximation. Then I subtracted the resulting image from every frame in video sequence to get foreground (moving objects).
The above method worked well, but then I tried to replace it by using OpenCV's background subtractors MOG and MOG2.
What I do not understand is how these two classes perform the "precomputation of the background model"? As far as I understood from dozens of tutorials and documentations, these subtractors update the background model every time I use the apply() method and return a foreground mask.
But this means that the first result of the apply() method will be a blank mask, and the later masks will contain a ghost of the object's initial position (see example below):
What am I missing? I googled a lot and seem to be the only one with this problem... Is there a way to run background precomputation that I am not aware of?
EDIT: I found a "trick" to do it: Before using OpenCV's MOG or MOG2 I first compute median background image, then I use it in first apply() call. The following apply() calls produce the foreground mask without the initial position ghost.
But still, is this how it should be done or is there a better way?
If your moving objects are present right from the start, all updating background estimators will place them in the background initially. A solution is to initialize your MOG on all frames and then run MOG again with this initialization (as with your median estimate). Depending on the number of frames, you might want to adjust the update parameter of MOG (learningRate) to make sure it is fully initialized (if you have 100 frames it probably needs to be higher, at least 0.01):
void BackgroundSubtractorMOG::operator()(InputArray image, OutputArray fgmask, double learningRate=0)
If your moving objects are not present right from the start, make sure that MOG is fully initialized when they appear by setting a high enough value for the update parameter learningRate.
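A rough Python sketch of that priming idea, seeding MOG2 with a median background before processing the real frames (the file name and learning-rate values here are placeholders, not part of the answer):
import cv2
import numpy as np

cap = cv2.VideoCapture('video.avi')   # placeholder path
frames = []
while True:
    ret, frame = cap.read()
    if not ret:
        break
    frames.append(frame)

# Per-pixel median over the frames approximates the static background
# (for long videos, take a sampled subset instead of every frame).
median_bg = np.median(np.stack(frames), axis=0).astype(np.uint8)

mog2 = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
mog2.apply(median_bg, learningRate=1.0)   # rate 1.0 makes the model follow this frame completely

for frame in frames:
    fg_mask = mog2.apply(frame, learningRate=0.01)  # continue updating slowly from here on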

How to show a stereo camera with the Oculus Rift?

I use OpenCV to show the left and right images from a stereo camera in a new window. Now I want to see the same thing on the Oculus Rift, but when I connect the Oculus the picture does not turn into the characteristic circled (lens-distorted) image that suits the lenses inside the Oculus...
Do I need to process the image myself? Is it not automatic?
This is the code that shows the window:
cap >> frame; //cap= camera 1 & cap2=camera 2
cap.read(frame);
sz1 = frame.size();
//second camera
cap2 >> frame2;
cap2.read(frame2);
sz2 = frame2.size();
cv::Mat bothFrames(sz2.height, sz2.width + sz1.width, CV_8UC3);
// Move right boundary to the left.
bothFrames.adjustROI(0, 0, 0, -sz1.width);
frame2.copyTo(bothFrames);
// Move the left boundary to the right, right boundary to the right.
bothFrames.adjustROI(0, 0, -sz2.width, sz1.width);
frame.copyTo(bothFrames);
// restore original ROI.
bothFrames.adjustROI(0, 0, sz2.width, 0);
cv::imencode(".jpg", bothFrames, buf, params);
I have another problem. I'm trying to add the OVR library to my code, but I get the error "'System': ambiguous symbol" because some class inside the OVR library uses the same namespace... This error arises when I add the
#include "OVR.h"
using namespace OVR;
-.-"
The SDK is meant to perform lens distortion correction, chromatic aberration correction (different refractive indices for different colours of light cause colour fringing in the image without correction), time warp, and possibly other corrections in the future. Unless you have a heavyweight graphics pipeline that you're hand-optimizing, it's best to use the SDK rendering option.
You can learn about the SDK and different kinds of correction here:
http://static.oculusvr.com/sdk-downloads/documents/Oculus_SDK_Overview.pdf
It also explains how the distortion corrections are applied. The SDK is open source so you could also just read the source for a more thorough understanding.
To fix your namespace issue, just don't switch to the OVR namespace! Every time you refer to something from the OVR namespace, prefix it with OVR:: - e.g., OVR::Math - this is, after all, the whole point of namespaces :p

EmguCV Shape Detection affected by Image Size

I'm using the Emgu shape detection example application to detect rectangles on a given image. The dimensions of the resized image appear to impact the number of shapes detected even though the aspect ratio remains the same. Here's what I mean:
Using (400,400), actual img size == 342,400
Using (520,520), actual img size == 445,520
Why is this so? And how can the optimal value be determined?
Thanks
I replied to your post on EMGU but figured you haven't checked back, so here it is. The shape detection works on the principle of thresholding unlikely matches, which prevents lots of false classifications. This is true for many image processing algorithms. Basically there are no perfect settings, and a designer must select the most appropriate settings to produce the most desirable results, i.e. match as many objects as possible without reporting more than there actually are.
You will need to adjust each variable individually to see what kind of results you get. Start off with the edge detection.
Image<Gray, Byte> cannyEdges = gray.Canny(cannyThreshold, cannyThresholdLinking);
Have a look at your smaller image and see what the difference is between the rectangles detected and the one that isn't. You could be missing an edge or a corner, which is why it's not classified. If you are, adjust cannyThreshold and observe the results: if good, keep it :) if bad :( go back to the original value. Once satisfied, adjust cannyThresholdLinking and observe.
You will keep repeating this until you get a preferred image. The advantage here is that you have 3 items to compare; you continue until the item that's not being recognised matches the other two.
If they are similar, which is likely as it is a black and white image, you'll need to go on to the Hough lines detection.
LineSegment2D[] lines = cannyEdges.HoughLinesBinary(
    1,              //Distance resolution in pixel-related units
    Math.PI / 45.0, //Angle resolution measured in radians
    20,             //threshold
    30,             //min line width
    10              //gap between lines
    )[0];           //Get the lines from the first channel
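Although the example above is EmguCV (C#), the same parameters exist in plain OpenCV, so here is a hypothetical Python sketch of the adjust-one-value-at-a-time sweep (the file name, the threshold values, and the use of cv2.HoughLinesP in place of HoughLinesBinary are all my assumptions):
import math
import cv2

img = cv2.imread('shapes.png', cv2.IMREAD_GRAYSCALE)   # placeholder file name

# Sweep only cannyThreshold; hold every other setting fixed and observe the output.
for canny_threshold in (60, 120, 180):
    edges = cv2.Canny(img, canny_threshold, 120)        # 120 = linking threshold, kept constant
    lines = cv2.HoughLinesP(edges, 1, math.pi / 45.0, 20,
                            minLineLength=30, maxLineGap=10)
    count = 0 if lines is None else len(lines)
    print(canny_threshold, count)                        # watch how the detected line count changes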
Use the same method of adjusting one value at a time and observing the output, and you will hopefully find the settings you need. Never jump in with both feet and change all the values, as you will never know whether you're improving the accuracy or not. Finally, if all else fails, look at the section that inspects the Hough results for a rectangle:
if (angle < 80 || angle > 100)
{
    isRectangle = false;
    break;
}
There are fewer variables to change here, as Hough should do all the work for you, but it could still all come down to this check.
I'm sorry that there is no straightforward answer, but I hope you keep at it and solve the problem. Otherwise you could always resize the image each time.
Cheers
Chris

iOS: Smooth button Glow effect by blending between images

I am creating a custom button that needs to be able to glow to a varying degree
How would I use these pictures to make a button that 'glows' the diamond when it is pressed, and have this glow gradually fade back to inert state?
I want to churn out several different colours of diamond as well... I am hoping to generate all different coloured diamonds from the same stock images presented here.
I would like to get my head around the basic methods available, in enough detail that I can see each one through and make a decision which path to take...
My tangled efforts so far... ( I will delete all of this, or move it into possibly several answers as a solution unfolds... )
I can see 3 potential solution paths:
GL
it looks as though GL has everything it takes to get complete fine-grained control over the process, although functions exposed by core graphics come tantalisingly close, and that would save several hundred lines of code spread over a bunch of source files, which seems a bit ridiculous for such a basic task.
core graphics, and core animation to accomplish the blending
The documentation goes on to say:
Anything underneath the unpainted samples, such as the current fill color or other drawing, shows through.
so I can chroma-key mask the left image, setting {0,0,0}, i.e. black, as the key.
this at least secures a transparent background, now I have to work on making it yellow instead of grey.
so maybe I could have started instead with setting a yellow back colour for my image context, then use some CGContextSetBlendMode(...) to imprint the diamond on the yellow, THEN use chroma-key masking to get a transparent background
ok, this covers at least getting the basic unlit image on-screen
now I could overlay the sparkly image, using some blend mode, maybe I could keep it in its current greyscale state, and that would just boost the colours of the original
only problem with this is that it is a lot of heavy real-time blending
so maybe I could pre-calculate every image in the animation... this is looking increasingly mucky...
Cocos2D
if this allows me to set the blend mode to additive blending then I could just composite the glowing image over the original image with an appropriate Alpha setting.
After digging through a lot of documentation, the optimal solution seems to be to use core graphics functions to get the source images into a single 2-component GL texture, and then use GL to blend between them.
I will need to pass a uniform value glow_factor into the shader
The obvious solution might seem to simply use
r,g,b = in_rgb * ((1 - glow_factor) * inertPixel + glow_factor * shinyPixel)
(where inertPixel is the corresponding pixel of the inert diamond texture, and shinyPixel of the sparkly one)...
it looks like I would also do well to manufacture my own sparkles and add them over the top; a gem should sparkle white irrespective of its characteristic colour.
After having looked at this problem a little more, I can see several solutions
Solution A -- store the transition from glow=0 to glow=1 as 60 frames in memory, then load the appropriate frame into a GL texture every time it is required.
this has an obvious benefit that a graphic designer could construct the entire sequence and I could load it in as a bunch of PNG files.
another advantage is that these frames wouldn't need to be played in sequence... the appropriate frame can be chosen on-the-fly
however, it has a potential drawback of a lot of sending data RAM->VRAM
this can be optimised by using glTexSubImage2D; several frames can be sent simultaneously and then unpacked from within GL... in fact maybe the entire sequence. If so, it would make sense to use PVRTC texture compression.
iOS: playing a frame-by-frame greyscale animation in a custom colour
Solution B -- load glow=0 and glow=1 images as GL textures, and manually write shader code that takes in the glow factor as a uniform and performs the blend
this has the advantage that it is close to the wire and can be tweaked in all sorts of ways. It is also going to be very efficient. The disadvantage is that it is a big extra slice of code to maintain.
Solution C -- set glBlendMode to perform additive blending.
then draw the glow=0 image, setting e.g. alpha=0.2 on each vertex.
then draw the glow=1 image, setting e.g. alpha=0.8 on each vertex.
this has an advantage that it can be achieved with a more generic code structure -- ie a very general ' draw textured quad / sprite ' class.
the disadvantage is that without some sort of wrapper it is a bit messy... in my game I have a couple of dozen diamonds -- at any one time maybe 2 or 3 are likely to be glowing. So on the first pass I would render EVERYTHING (setting the alpha appropriately for anything that is glowing), and then on a second pass I would draw the glowing sprites again with the appropriate alpha.
it is worth noting that if I pursue solution A, this would involve creating some sort of real-time movie player object, which could be a very useful reusable code component.
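As a rough sketch of Solution A's pre-computation step only (not the movie player or the texture upload), assuming the two stock images are the same size; the file names are placeholders:
import numpy as np
from PIL import Image

inert = np.asarray(Image.open('diamond_inert.png').convert('RGB'), dtype=np.float32)
shiny = np.asarray(Image.open('diamond_glow.png').convert('RGB'), dtype=np.float32)

frames = []
for i in range(60):
    glow_factor = i / 59.0                                    # runs 0.0 .. 1.0 across 60 frames
    blended = (1.0 - glow_factor) * inert + glow_factor * shiny
    frames.append(Image.fromarray(blended.astype(np.uint8)))
# frames[n] can then be chosen on the fly and uploaded as a texture (or saved out as PNGs).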

What processing steps should I use to clean photos of line drawings?

My usual method of 100% contrast and some brightness adjustment to tweak the cutoff point usually works reasonably well to clean up photos of small sub-circuits or equations for posting on E&R.SE; however, sometimes it's not quite that great, like with this image:
What other methods besides contrast (or instead of) can I use to give me a more consistent output?
I'm expecting a fairly general answer, but I'll probably implement it in a script (that I can just dump files into) using ImageMagick and/or PIL (Python) so if you have anything specific to them it would be welcome.
Ideally a better source image would be nice, but I occasionally use this on other folks' images to add some polish.
The first step is to equalize the illumination differences in the image while taking into account the white balance issues. The theory here is that the brightest part of the image within a limited area represents white. By blurring the image beforehand we eliminate the influence of noise in the image.
from PIL import Image
from PIL import ImageFilter
im = Image.open(r'c:\temp\temp.png')
white = im.filter(ImageFilter.BLUR).filter(ImageFilter.MaxFilter(15))
The next step is to create a grey-scale image from the RGB input. By scaling to the white point we correct for white balance issues. By taking the max of R,G,B we de-emphasize any color that isn't a pure grey such as the blue lines of the grid. The first line of code presented here is a dummy, to create an image of the correct size and format.
grey = im.convert('L')
width, height = im.size
impix = im.load()
whitepix = white.load()
greypix = grey.load()
for y in range(height):
    for x in range(width):
        # Scale each channel by the local white estimate, keep the brightest, clamp to 255.
        greypix[x, y] = min(255, max(255 * impix[x, y][0] // whitepix[x, y][0],
                                     255 * impix[x, y][1] // whitepix[x, y][1],
                                     255 * impix[x, y][2] // whitepix[x, y][2]))
The result of these operations is an image that has mostly consistent values and can be converted to black and white via a simple threshold.
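For completeness, one way to do that final threshold in PIL (the cutoff of 128 is just an assumed starting point to tune per image):
bw = grey.point(lambda p: 255 if p > 128 else 0).convert('1')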
Edit: It's nice to see a little competition. nikie has proposed a very similar approach, using subtraction instead of scaling to remove the variations in the white level. My method increases the contrast in the regions with poor lighting, and nikie's method does not - which method you prefer will depend on whether there is information in the poorly lighted areas which you wish to retain.
My attempt to recreate this approach resulted in this:
for y in range(height):
    for x in range(width):
        # Same idea, but subtracting the white estimate instead of scaling by it.
        greypix[x, y] = max(0, min(255, max(255 + impix[x, y][0] - whitepix[x, y][0],
                                            255 + impix[x, y][1] - whitepix[x, y][1],
                                            255 + impix[x, y][2] - whitepix[x, y][2])))
I'm working on a combination of techniques to deliver an even better result, but it's not quite ready yet.
One common way to remove the different background illumination is to calculate a "white image" from the image, by opening the image.
In this sample Octave code, I've used the blue channel of the image, because the lines in the background are least prominent in this channel (EDITED: using a circular structuring element produces fewer visual artifacts than a simple box):
src = imread('lines.png');
blue = src(:,:,3);
mask = fspecial("disk",10);
opened = imerode(imdilate(blue,mask),mask);
Result:
Then subtract this from the source image:
background_subtracted = opened-blue;
(contrast enhanced version)
Finally, I'd just binarize the image with a fixed threshold:
binary = background_subtracted < 35;
How about detecting edges? That should pick up the line drawings.
Here's the result of Sobel edge detection on your image:
If you then threshold the image (using either an empirically determined threshold or Otsu's method), you can clean up the image using morphological operations (e.g. dilation and erosion). That will help you get rid of broken/double lines.
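A minimal Python/OpenCV sketch of that pipeline, purely as an illustration (the answer does not prescribe an implementation; the kernel size and file name are assumptions):
import cv2
import numpy as np

img = cv2.imread('drawing.jpg', cv2.IMREAD_GRAYSCALE)    # placeholder file name

# Sobel gradient magnitude picks up the drawn lines.
gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)
gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)
mag = cv2.convertScaleAbs(cv2.magnitude(gx, gy))

# Otsu chooses the threshold automatically.
_, mask = cv2.threshold(mag, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Dilation followed by erosion closes small gaps in broken lines.
kernel = np.ones((3, 3), np.uint8)
mask = cv2.erode(cv2.dilate(mask, kernel), kernel)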
As Lambert pointed out, you can pre-process the image using the blue channel to get rid of the grid lines if you don't want them in your result.
You will also get better results if you light the page evenly before you image it (or just use a scanner), because then you don't have to worry about global vs. local thresholding as much.
