I want to detect very small movements of a conveyor belt using image evaluation (resolution: 31x512, frame rate: 1000 images per second). The moment the belt starts moving is what matters to me.
If I apply cv::absdiff to two consecutive images, I obtain a very noisy result:
According to the mechanical rotation sensor of the motor, the movement starts here:
I tried to threshold the abs-diff image and clean it up with a cascade of erosion and dilation, but the earliest change I could detect was more than a second too late in this image:
Is it possible to find the change earlier?
Here is the sequence of the Images without changes (according to motor sensor):
In this sequence the movement begins in the middle image:
Looks like I've found a solution which works in MY case.
Instead of comparing the images in the spatial domain, apply a cross-correlation:
I transform both images with the DFT, multiply the DFT Mats, and transform the product back. The pixel with the maximum value is the correlation peak. As long as the images are the same, the peak stays in the same position; otherwise it moves.
The actual working code uses 3 images and 2 DFT products, one between images 1 and 2 and one between images 2 and 3:
Mat img1_( 512, 32, CV_16UC1 );
Mat img2_( 512, 32, CV_16UC1 );
Mat img3_( 512, 32, CV_16UC1 );
//read the data into the images however you want. I read from an MHD file
//Set ROI (if required)
Mat img1 = img1_(cv::Rect(0,200,32,100));
Mat img2 = img2_(cv::Rect(0,200,32,100));
Mat img3 = img3_(cv::Rect(0,200,32,100));
//Float mats for DFT
Mat img1f;
Mat img2f;
Mat img3f;
//DFT and product mats
Mat dft1,dft2,dft3,dftproduct,dftproduct2;
//Calculate DFT of all three images
img1.convertTo(img1f, CV_32FC1);
cv::dft(img1f, dft1);
img2.convertTo(img2f, CV_32FC1);
cv::dft(img2f, dft2);
img3.convertTo(img3f, CV_32FC1);
cv::dft(img3f, dft3);
//Multiply DFT Mats
cv::mulSpectrums(dft1,dft2,dftproduct,true);
cv::mulSpectrums(dft2,dft3,dftproduct2,true);
//Convert back to space domain
cv::Mat result,result2;
cv::idft(dftproduct,result);
cv::idft(dftproduct2,result2);
//Not sure if required, I needed it for visualizing
cv::normalize( result, result, 0, 255, NORM_MINMAX, CV_8UC1);
cv::normalize( result2, result2, 0, 255, NORM_MINMAX, CV_8UC1);
//Find maxima positions
double dummy;
Point locdummy; Point maxLoc1; Point maxLoc2;
cv::minMaxLoc(result, &dummy, &dummy, &locdummy, &maxLoc1);
cv::minMaxLoc(result2, &dummy, &dummy, &locdummy, &maxLoc2);
//Compare the two peak positions directly. Collapsing each position into x*y
//would hide any shift while x or y is 0. Different positions mean movement
if ( maxLoc1 != maxLoc2 )
{
std::cout << id << std::endl;
break;
}
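To make the peak idea easier to check in isolation, here is a minimal 1D sketch in plain C++ (no OpenCV). It evaluates the circular cross-correlation directly instead of going through the DFT, which gives the same peak position but is slower. The names `correlationPeak` and `movementStarted` are made up for this illustration and are not part of the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the peak of the circular cross-correlation of a and b.
// If b is a circularly shifted by s samples, the peak lies at s.
std::size_t correlationPeak(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size() && !a.empty());
    const std::size_t n = a.size();
    std::size_t bestShift = 0;
    double bestValue = 0.0;
    for (std::size_t shift = 0; shift < n; ++shift) {
        double sum = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            sum += static_cast<double>(a[i]) * b[(i + shift) % n];
        }
        if (shift == 0 || sum > bestValue) {
            bestValue = sum;
            bestShift = shift;
        }
    }
    return bestShift;
}

// Same decision rule as above: movement is reported when the peak for
// frames (1,2) differs from the peak for frames (2,3).
bool movementStarted(const std::vector<float>& f1,
                     const std::vector<float>& f2,
                     const std::vector<float>& f3) {
    return correlationPeak(f1, f2) != correlationPeak(f2, f3);
}
```

With three identical frames both peaks sit at shift 0 and nothing is reported; as soon as the third frame is shifted, the second peak moves away from 0.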
We're currently trying to detect the object regions in images of medical instruments using the methods available in OpenCV (C++ version). An example image is shown below:
Here are the steps we're following:
Converting the image to grayscale
Applying a median filter
Finding edges using a Sobel filter
Converting the result to a binary image using a threshold of 25
Skeletonizing the image to make sure we have neat edges
Finding the X largest connected components
This approach works perfectly for image 1, and here is the result:
The yellow borders are the connected components detected.
The rectangles are just to highlight the presence of a connected component.
To get understandable results, we removed the connected components that lie completely inside another one, so the end result is something like this:
So far, everything was fine, but another sample image, shown below, complicated our work.
Having a small light green towel under the objects results in this image:
After filtering the regions as we did earlier, we got this:
Obviously, this is not what we need. We're expecting something like this:
I'm thinking about somehow clustering the closest connected components so that we can minimize the impact of the towel, but I don't know yet whether that is doable or whether someone has tried something like it before. Also, does anyone have a better idea for overcoming this kind of problem?
Thanks in advance.
Here's what I tried.
In the images, the background is mostly greenish and the area of the background is considerably larger than that of the foreground. So, if you take a color histogram of the image, the greenish bins will have higher values. Threshold this histogram so that bins having smaller values are set to zero. This way we'll most probably retain the greenish (higher value) bins and discard other colors. Then backproject this histogram. The backprojection will highlight these greenish regions in the image.
Backprojection:
Then threshold this backprojection. This gives us the background.
Background (after some morphological filtering):
Invert the background to get foreground.
Foreground (after some morphological filtering):
Then find the contours of the foreground.
I think this gives a reasonable segmentation, and using this as mask you may be able to use a segmentation like GrabCut to refine the boundaries (I haven't tried this yet).
EDIT:
I tried the GrabCut approach and it indeed refines the boundaries. I've added the code for GrabCut segmentation.
Contours:
GrabCut segmentation using the foreground as mask:
I'm using the OpenCV C API for the histogram processing part.
// load the color image
IplImage* im = cvLoadImage("bFly6.jpg");
// get the color histogram
IplImage* im32f = cvCreateImage(cvGetSize(im), IPL_DEPTH_32F, 3);
cvConvertScale(im, im32f);
int channels[] = {0, 1, 2};
int histSize[] = {32, 32, 32};
float rgbRange[] = {0, 256};
float* ranges[] = {rgbRange, rgbRange, rgbRange};
CvHistogram* hist = cvCreateHist(3, histSize, CV_HIST_ARRAY, ranges);
IplImage* b = cvCreateImage(cvGetSize(im32f), IPL_DEPTH_32F, 1);
IplImage* g = cvCreateImage(cvGetSize(im32f), IPL_DEPTH_32F, 1);
IplImage* r = cvCreateImage(cvGetSize(im32f), IPL_DEPTH_32F, 1);
IplImage* backproject32f = cvCreateImage(cvGetSize(im), IPL_DEPTH_32F, 1);
IplImage* backproject8u = cvCreateImage(cvGetSize(im), IPL_DEPTH_8U, 1);
IplImage* bw = cvCreateImage(cvGetSize(im), IPL_DEPTH_8U, 1);
IplConvKernel* kernel = cvCreateStructuringElementEx(3, 3, 1, 1, MORPH_ELLIPSE);
cvSplit(im32f, b, g, r, NULL);
IplImage* planes[] = {b, g, r};
cvCalcHist(planes, hist);
// find min and max values of histogram bins
float minval, maxval;
cvGetMinMaxHistValue(hist, &minval, &maxval);
// threshold the histogram. this sets the bin values that are below the threshold to zero
cvThreshHist(hist, maxval/32);
// backproject the thresholded histogram. backprojection should contain higher values for the
// background and lower values for the foreground
cvCalcBackProject(planes, backproject32f, hist);
// convert to 8u type
double min, max;
cvMinMaxLoc(backproject32f, &min, &max);
cvConvertScale(backproject32f, backproject8u, 255.0 / max);
// threshold backprojected image. this gives us the background
cvThreshold(backproject8u, bw, 10, 255, CV_THRESH_BINARY);
// some morphology on background
cvDilate(bw, bw, kernel, 1);
cvMorphologyEx(bw, bw, NULL, kernel, MORPH_CLOSE, 2);
// get the foreground
cvSubRS(bw, cvScalar(255, 255, 255), bw);
cvMorphologyEx(bw, bw, NULL, kernel, MORPH_OPEN, 2);
cvErode(bw, bw, kernel, 1);
// find contours of the foreground
//CvMemStorage* storage = cvCreateMemStorage(0);
//CvSeq* contours = 0;
//cvFindContours(bw, storage, &contours);
//cvDrawContours(im, contours, CV_RGB(255, 0, 0), CV_RGB(0, 0, 255), 1, 2);
// grabcut
Mat color(im);
Mat fg(bw);
Mat mask(bw->height, bw->width, CV_8U);
mask.setTo(GC_PR_BGD);
mask.setTo(GC_PR_FGD, fg);
Mat bgdModel, fgdModel;
grabCut(color, mask, Rect(), bgdModel, fgdModel, GC_INIT_WITH_MASK);
Mat gcfg = mask == GC_PR_FGD;
vector<vector<cv::Point>> contours;
vector<Vec4i> hierarchy;
findContours(gcfg, contours, hierarchy, CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE, cv::Point(0, 0));
for(int idx = 0; idx < contours.size(); idx++)
{
drawContours(color, contours, idx, Scalar(0, 0, 255), 2);
}
// cleanup ...
UPDATE: We can do the above using the C++ interface as shown below.
const int channels[] = {0, 1, 2};
const int histSize[] = {32, 32, 32};
const float rgbRange[] = {0, 256};
const float* ranges[] = {rgbRange, rgbRange, rgbRange};
Mat hist;
Mat im32fc3, backpr32f, backpr8u, backprBw, kernel;
Mat im = imread("bFly6.jpg");
im.convertTo(im32fc3, CV_32FC3);
calcHist(&im32fc3, 1, channels, Mat(), hist, 3, histSize, ranges, true, false);
calcBackProject(&im32fc3, 1, channels, hist, backpr32f, ranges);
double minval, maxval;
minMaxIdx(backpr32f, &minval, &maxval);
threshold(backpr32f, backpr32f, maxval/32, 255, THRESH_TOZERO);
backpr32f.convertTo(backpr8u, CV_8U, 255.0/maxval);
threshold(backpr8u, backprBw, 10, 255, THRESH_BINARY);
kernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));
dilate(backprBw, backprBw, kernel);
morphologyEx(backprBw, backprBw, MORPH_CLOSE, kernel, Point(-1, -1), 2);
backprBw = 255 - backprBw;
morphologyEx(backprBw, backprBw, MORPH_OPEN, kernel, Point(-1, -1), 2);
erode(backprBw, backprBw, kernel);
Mat mask(backpr8u.rows, backpr8u.cols, CV_8U);
mask.setTo(GC_PR_BGD);
mask.setTo(GC_PR_FGD, backprBw);
Mat bgdModel, fgdModel;
grabCut(im, mask, Rect(), bgdModel, fgdModel, GC_INIT_WITH_MASK);
Mat fg = mask == GC_PR_FGD;
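To make the histogram-threshold and backprojection steps easier to follow, here is a single-channel sketch in plain C++ without OpenCV. It is a simplification of the code above, not a replacement: it uses one gray channel instead of three color channels, and it treats every pixel whose bin survives the threshold as background instead of thresholding a scaled backprojection. The function name `dominantValueMask` and the default bin width are assumptions made for this illustration; the `maxVal / 32` threshold is the one used above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true for pixels that fall into a dominant histogram bin
// (the likely background) and false for all other pixels.
std::vector<bool> dominantValueMask(const std::vector<std::uint8_t>& pixels,
                                    int binWidth = 8) {
    assert(binWidth > 0 && binWidth <= 256);
    const int binCount = (256 + binWidth - 1) / binWidth;

    // 1. Histogram of the quantized values.
    std::vector<double> hist(binCount, 0.0);
    for (std::uint8_t p : pixels) hist[p / binWidth] += 1.0;

    // 2. Threshold the histogram: bins below maxVal / 32 are cleared.
    const double maxVal = *std::max_element(hist.begin(), hist.end());
    const double threshold = maxVal / 32.0;
    for (double& bin : hist) {
        if (bin < threshold) bin = 0.0;
    }

    // 3. Backproject: look up each pixel's bin; a surviving bin means background.
    std::vector<bool> mask(pixels.size());
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        mask[i] = hist[pixels[i] / binWidth] > 0.0;
    }
    return mask;
}
```

Inverting the returned mask gives the foreground candidates, which correspond to the regions that are cleaned up with morphology and passed to GrabCut above.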
I would consider a few options. My assumption is that the camera does not move. I haven't used the images or written any code, so this is mostly from experience.
Rather than just looking for edges, try separating the background using a segmentation algorithm. A mixture-of-Gaussians background model can help with this. Given a set of images of the same region (i.e. video), you can cancel out the regions that are persistent. New items such as instruments will then pop out, and connected components can be run on the resulting blobs.
I would look at segmentation algorithms to see if you can optimize the conditions to make this work for you. One major point is to make sure your camera is stable, or else to stabilize the images yourself in pre-processing.
I would consider using interest points to identify regions in the image with a lot of new material. Given that the background is relatively plain, small objects such as needles will create a bunch of interest points. The towel should be much more sparse. Perhaps overlaying the detected interest points over the connected component footprint will give you a "density" metric which you can then threshold. If the connected component has a large ratio of interest points for the area of the item, then it is an interesting object.
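A rough sketch of that density check in plain C++, not tested on the images from the question. It assumes the interest points have already been detected by some other step and approximates the component footprint by its bounding box; `Point2`, `Box`, `interestPointDensity`, `isInteresting` and the `minDensity` threshold are all hypothetical names for this illustration.

```cpp
#include <cassert>
#include <vector>

struct Point2 { int x, y; };               // an interest point location
struct Box { int x, y, width, height; };   // bounding box of a connected component

// Number of interest points inside the box divided by the box area.
double interestPointDensity(const Box& box, const std::vector<Point2>& points) {
    assert(box.width > 0 && box.height > 0);
    int inside = 0;
    for (const Point2& p : points) {
        const bool inX = p.x >= box.x && p.x < box.x + box.width;
        const bool inY = p.y >= box.y && p.y < box.y + box.height;
        if (inX && inY) ++inside;
    }
    return static_cast<double>(inside) /
           (static_cast<double>(box.width) * box.height);
}

// A component counts as interesting when its density reaches the threshold.
bool isInteresting(const Box& box, const std::vector<Point2>& points, double minDensity) {
    return interestPointDensity(box, points) >= minDensity;
}
```

A suitable `minDensity` would have to be tuned on real images, so that sparse regions such as the towel fall below it while instruments stay above it.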
On this note, you can even clean up the connected component footprint by using a Convex Hull to prune the objects you have detected. This may help situations such as a medical instrument casting a shadow on the towel which stretches the component region. This is a guess, but interest points can definitely give you more information than just edges.
Finally, given that you have a stable background with clear objects in view, I would take a look at Bag-of-Features to see if you can just detect each individual object in the image. This may be useful since there seems to be a consistent pattern to the objects in these images. You can build a large database of images of needles, gauze, scissors, etc. BoF, which is available in OpenCV, will then find those candidates for you. You can also combine it with the other operations you are doing and compare the results.
Bag of Features using OpenCV
http://www.codeproject.com/Articles/619039/Bag-of-Features-Descriptor-on-SIFT-Features-with-O
I would also suggest an addition to your initial version: keep only the contours whose bounding rectangle is narrower than half the image width and shorter than half the image height, and skip the rest.
//take the rect of the contours
Rect rect = Imgproc.boundingRect(contours.get(i));
if (rect.width < inputImageWidth / 2 && rect.height < inputImageHeight / 2)
//then continue to draw or use for next purposes.
Here's my problem: I'm trying to create a simple program which adds Gaussian noise to an input image. The only constraints are that the input image is of type CV_64F (i.e. double) and that its values are, and must stay, normalized between 0 and 1.
The code I wrote is the following:
Mat noise;
noise = Mat(input.size(), input.type());
randn(noise, 0, 5); //mean and standard deviation
input += noise;
The above code doesn't work: the resulting image doesn't get displayed properly. I think that happens because the values leave the [0, 1] range. I modified the code like this:
Mat noise;
noise = Mat(input.size(), input.type());
randn(noise, 0, 5); //mean and standard deviation
input += noise;
normalize(input, input, 0.0, 1.0, CV_MINMAX, CV_64F);
but it still doesn't work. Again, the resulting image doesn't get displayed properly. Where is the problem? Remember: the input image is of type CV_64F, and its values are normalized between 0 and 1 before the noise is added and have to remain so after the noise is added.
Thank you in advance.
Your problem is that Gaussian noise can have arbitrary amplitude and can't be represented in [0, 1]. Renormalizing after adding the noise is a mistake, because just one large noise value could affect the whole image.
What you probably need to do is saturate the image when adding the noise: values that would be greater than 1.0 are clamped to 1.0, and values that would be less than 0.0 are clamped to 0.0.
Something like
cv::Mat noise(input.size(), input.type());
cv::randn(noise, 0, 5); //mean and standard deviation
input += noise;
cv::Mat clamp_1 = cv::Mat::ones(input.size(), input.type());
cv::Mat clamp_0 = cv::Mat::zeros(input.size(), input.type());
input = cv::max(input, clamp_0);
input = cv::min(input, clamp_1);
Also, a noise standard deviation of 5 (the third argument of cv::randn is the standard deviation, not the variance) is very large. It means that there is about a 92% chance that input + noise will be outside the range [0, 1], assuming the input is uniformly distributed on [0, 1]. So your saturated image will be mostly black and white, with the input image having little effect on the result.
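The 92% figure can be checked numerically. The sketch below is plain C++ without OpenCV and assumes, as stated above, that the input is uniform on [0, 1] and the noise is zero-mean Gaussian. It integrates, over all input values u, the probability that u + noise stays inside [0, 1]; the function names are made up for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Standard normal cumulative distribution function.
double normalCdf(double z) {
    return 0.5 * std::erfc(-z / std::sqrt(2.0));
}

// Probability that u + n falls outside [0, 1], where u is uniform on [0, 1]
// and n is Gaussian with mean 0 and standard deviation sigma.
double probabilityOutsideUnitRange(double sigma) {
    assert(sigma > 0.0);
    const int steps = 10000;
    double inside = 0.0;
    for (int i = 0; i < steps; ++i) {
        const double u = (i + 0.5) / steps;  // midpoint rule over u
        // P(0 <= u + n <= 1) for this particular u
        inside += normalCdf((1.0 - u) / sigma) - normalCdf(-u / sigma);
    }
    return 1.0 - inside / steps;
}
```

Calling `probabilityOutsideUnitRange(5.0)` gives roughly 0.92, and the value drops quickly as the standard deviation is reduced towards a fraction of the [0, 1] range.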
I am trying to subtract the background from depth images acquired with a Kinect. When I learned what Otsu thresholding is, I thought that it could help with this. After converting the depth image to grayscale, I can hopefully apply an Otsu threshold to binarize the image.
However, when I tried to implement this with OpenCV 2.3, it was in vain. The output image is binarized, but in a very unexpected way. I ran the thresholding continuously (i.e. I displayed the result on screen for each frame) and saw that for some frames the threshold is found to be around 160 and for others it is found to be 0. I couldn't quite understand why this is happening. Could it be due to the high number of 0's in the depth image returned by the Kinect, which correspond to pixels that cannot be measured? Is there a way to tell the algorithm to ignore pixels that have the value 0? Or is Otsu thresholding not suitable for what I am trying to do?
Here are some outputs and a segment of the related code. You may notice that the second screenshot looks like a reasonable binarization; however, I want to achieve one that distinctly differentiates between the pixels corresponding to the chair in the scene and the background.
Thanks.
cv::Mat1s depthcv(depth->getHeight(), depth->getWidth());
cv::Mat1b depthcv8(depth->getHeight(), depth->getWidth());
cv::Mat1b depthcv8_th(depth->getHeight(), depth->getWidth());
depthcv.data =(uchar*) depth->getDepthMetaData().Data();
depthcv.convertTo(depthcv8,CV_8U,255/5000.f);
//apply otsu thresholding
cv::threshold(depthcv8, depthcv8_th, 128, 255, CV_THRESH_BINARY|CV_THRESH_OTSU);
std::ofstream output;
output.open("output.txt");
//output << "M = "<< endl << " " << depthcv8 << endl << endl;
cv::imshow("lab",depthcv8_th);
cv::waitKey(1);
Otsu is probably good enough for what you are trying to do, but you do need to mask out the zero values before computing the optimal threshold with the Otsu algorithm, otherwise the distribution of intensity values will be skewed lower than what you want.
OpenCV does not provide a mask argument for the cv::threshold function, so you will have to remove those values yourself. I would recommend putting all the non-zero values in a 1 by N matrix, and calling the cv::threshold function with CV_THRESH_OTSU and saving the return value (which is the estimated optimal threshold), and then running the cv::threshold function again on the original image with just the CV_THRESH_BINARY flag and the computed threshold.
Here is one possible implementation:
// move zeros to the back of a temp array
// (origImg is assumed to be a single-channel CV_8U image)
cv::Mat copyImg = origImg.clone(); // deep copy, so origImg is not modified
uchar* ptr = copyImg.data;
uchar* ptr_end = copyImg.data + copyImg.total(); // one past the last valid value
while (ptr < ptr_end) {
if (*ptr == 0) { // swap the zero with the last unprocessed value
ptr_end--; // make array smaller
*ptr = *ptr_end;
*ptr_end = 0;
} else {
ptr++;
}
}
// make a new matrix with only valid (non-zero) data
cv::Mat nz = cv::Mat(std::vector<uchar>(copyImg.data, ptr_end), true);
// compute optimal Otsu threshold
double thresh = cv::threshold(nz,nz,0,255,CV_THRESH_BINARY | CV_THRESH_OTSU);
// apply threshold
cv::threshold(origImg,origImg,thresh,255,CV_THRESH_BINARY_INV);
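An alternative that avoids rearranging the pixels is to compute the Otsu threshold from the histogram and simply leave bin 0 out. The sketch below does this in plain C++ without OpenCV, so it is a hand-written Otsu implementation and not the one inside cv::threshold; the function name `otsuIgnoringZeros` is made up for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Otsu threshold of 8-bit samples, ignoring every sample equal to 0.
// Samples <= the returned value form one class, samples above it the other.
// Returns -1 if there are no non-zero samples.
int otsuIgnoringZeros(const std::vector<std::uint8_t>& pixels) {
    std::array<long long, 256> hist{};
    for (std::uint8_t p : pixels) hist[p]++;
    hist[0] = 0;  // drop the invalid (unmeasured) depth values

    long long total = 0;
    double sumAll = 0.0;
    for (int i = 1; i < 256; ++i) {
        total += hist[i];
        sumAll += static_cast<double>(i) * hist[i];
    }
    if (total == 0) return -1;

    long long weightLow = 0;
    double sumLow = 0.0;
    double bestVariance = -1.0;
    int bestThreshold = 0;
    for (int t = 1; t < 256; ++t) {
        weightLow += hist[t];
        sumLow += static_cast<double>(t) * hist[t];
        if (weightLow == 0) continue;
        const long long weightHigh = total - weightLow;
        if (weightHigh == 0) break;
        const double meanLow = sumLow / weightLow;
        const double meanHigh = (sumAll - sumLow) / weightHigh;
        // between-class variance (up to a constant factor)
        const double between = static_cast<double>(weightLow) * weightHigh *
                               (meanLow - meanHigh) * (meanLow - meanHigh);
        if (between > bestVariance) {
            bestVariance = between;
            bestThreshold = t;
        }
    }
    return bestThreshold;
}
```

The returned value can then be passed as the fixed threshold to the second cv::threshold call, just like `thresh` above.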