Here's my problem: I'm trying to create a simple program which adds Gaussian noise to an input image. The only constraints are that the input image is of type CV_64F (i.e. double) and that its values are, and must remain, normalized between 0 and 1.
The code I wrote is the following:
Mat noise;
noise = Mat(input.size(), input.type());
randn(noise, 0, 5); //mean and variance
input += noise;
The above code doesn't work: the resulting image doesn't get displayed properly. I think that happens because the values leave the [0, 1] range. I modified the code like this:
Mat noise;
noise = Mat(input.size(), input.type());
randn(noise, 0, 5); //mean and variance
input += noise;
normalize(input, input, 0.0, 1.0, CV_MINMAX, CV_64F);
but it still doesn't work. Again, the resulting image doesn't get displayed properly. Where is the problem? Remember: the input image is of type CV_64F, and its values are normalized between 0 and 1 before adding noise and must remain so after the noise is added.
Thank you in advance.
Your problem is that Gaussian noise can have arbitrary amplitude and can't be represented in [0, 1]. Renormalizing after adding the noise is a mistake, because just one large noise value could affect the whole image.
Probably what you need to do is saturate the image when adding the noise: values that would be greater than 1.0 are clamped to 1.0, and values that would be less than 0.0 are clamped to 0.0.
Something like
cv::Mat noise(input.size(), input.type());
cv::randn(noise, 0, 5); // mean and standard deviation
input += noise;
cv::Mat clamp_1 = cv::Mat::ones(input.size(), input.type());
cv::Mat clamp_0 = cv::Mat::zeros(input.size(), input.type());
input = cv::max(input, clamp_0);
input = cv::min(input, clamp_1);
Also, the third argument of cv::randn is the standard deviation rather than the variance, and a noise standard deviation of 5 is very large: there is about a 92% chance that input + noise will fall outside the range [0, 1], assuming the input is uniformly distributed on [0, 1]. So your saturated image will be mostly black and white, with the input image having little effect on the result.
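The 92% figure can be checked numerically. The following is a minimal sketch in plain C++ (no OpenCV), assuming a pixel value uniform on [0, 1] and zero-mean Gaussian noise; the function names normalCdf and probOutsideUnitRange are made up for this illustration.

```cpp
#include <cmath>

// Standard normal cumulative distribution function.
double normalCdf(double z) {
    return 0.5 * std::erfc(-z / std::sqrt(2.0));
}

// Probability that x + n falls outside [0, 1] when x is uniform on [0, 1]
// and n is zero-mean Gaussian noise with standard deviation sigma.
// The integral over x is approximated with the midpoint rule.
double probOutsideUnitRange(double sigma, int steps = 10000) {
    double inside = 0.0;
    for (int i = 0; i < steps; ++i) {
        double x = (i + 0.5) / steps;
        // Chance that the noisy value stays within [0, 1] for this x.
        inside += normalCdf((1.0 - x) / sigma) - normalCdf(-x / sigma);
    }
    return 1.0 - inside / steps;
}
```

Calling probOutsideUnitRange(5.0) gives roughly 0.92, while a much smaller standard deviation such as 0.05 leaves only a few percent of the pixels to be clamped.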
In OpenCV how do you calculate the average gradient strength in a Mat and the average gradient direction?
I have sourced the below methods by googling but I want to confirm I am actually doing this correctly before moving onto the next step.
Is this correct?
Mat src = imread("foo.png", IMREAD_GRAYSCALE); // read image as grayscale single channel
// Calculate the mean intensity and the std deviation
// Any errors here or am I doing this correctly?
Scalar sMean, sStdDev;
meanStdDev(src, sMean, sStdDev);
double meanIntensity = sMean[0]; // not named "mean", which would hide cv::mean used below
double stddev = sStdDev[0];
// Calculate the average gradient magnitude/strength across the image
// Any errors here or am I doing this correctly?
Mat dX, dY, gradMag;
Sobel(src, dX, CV_32F, 1, 0, 1);
Sobel(src, dY, CV_32F, 0, 1, 1);
magnitude(dX, dY, gradMag); // the Mat must not share the name of cv::magnitude
Scalar sMMean, sMStdDev;
meanStdDev(gradMag, sMMean, sMStdDev);
double magnitudeMean = sMMean[0];
double magnitudeStdDev = sMStdDev[0];
// Calculate the average gradient direction across the image
// Any errors here or am I doing this correctly?
Scalar avgHorizDir = mean(dX);
Scalar avgVertDir = mean(dY);
double avgDir = atan2(-avgVertDir[0], avgHorizDir[0]);
float blurriness = cv::videostab::calcBlurriness(src); // low values = sharper. High values = blurry
Technically those are the correct ways of obtaining the two averages.
The way you compute mean direction uses weighted directional statistics, meaning that pixels without a strong gradient have less influence on the average.
However, for most images this average direction is not very meaningful, as there are edges in all directions and they cancel out.
If your image is of a single edge, then this will work great.
If your image has lines in it, containing edges in opposite directions, this will not work. In this case, you want to average the double angle (i.e. average orientations). The obvious way of doing this is to compute the direction per pixel as an angle, double it, then use directional statistics to average (i.e. convert back to vectors and average those). Doubling the angle maps opposite directions to the same value, so averaging doesn't cancel them out. The resulting mean angle must be halved again to get the orientation.
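The double-angle averaging described above can be sketched in plain C++ as follows. This is only an illustration: the function name meanOrientation is hypothetical, the gradients are passed as two flat vectors rather than Mats, and weighting each pixel by its gradient magnitude is an assumption of this sketch. It uses atan2(dy, dx); flip the sign of dy beforehand if you want the image-coordinate convention from the question.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Mean orientation in [-pi/2, pi/2] from per-pixel gradient components.
// Each gradient angle is doubled, turned back into a vector weighted by the
// gradient magnitude, summed, and the angle of the sum is halved.
double meanOrientation(const std::vector<double>& dx, const std::vector<double>& dy) {
    double sumCos = 0.0, sumSin = 0.0;
    for (std::size_t i = 0; i < dx.size(); ++i) {
        double mag = std::hypot(dx[i], dy[i]);
        double doubled = 2.0 * std::atan2(dy[i], dx[i]);
        sumCos += mag * std::cos(doubled);
        sumSin += mag * std::sin(doubled);
    }
    return 0.5 * std::atan2(sumSin, sumCos);
}
```

For two opposite gradients such as (1, 1) and (-1, -1), a plain vector average is zero, whereas this function returns an orientation of pi/4.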
Another simple way to average orientations is to take the average of the tensor field obtained by the outer product of the gradient field with itself, and determine the direction of the eigenvector corresponding to the largest eigenvalue. The tensor field is obtained as follows:
Mat Sxx = dX.mul(dX); // element-wise product; operator* would be a matrix product
Mat Syy = dY.mul(dY);
Mat Sxy = dX.mul(dY);
This should then be averaged:
Scalar mSxx = mean(Sxx);
Scalar mSyy = mean(Syy);
Scalar mSxy = mean(Sxy);
These values form a 2x2 real-valued symmetric matrix:
| mSxx mSxy |
| mSxy mSyy |
It is relatively straight-forward to determine its eigendecomposition, and can be done analytically. I don’t have the equations on hand right now, so I’ll leave it as an exercise to the reader. :)
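For reference, here is one way the analytic eigendecomposition of a 2x2 symmetric matrix could look. This is a sketch rather than the answerer's own derivation: the struct and function names are made up, and the three arguments are assumed to be the averaged tensor components (e.g. mSxx[0], mSxy[0], mSyy[0]).

```cpp
#include <cmath>

struct TensorOrientation {
    double lambdaMax;  // largest eigenvalue
    double lambdaMin;  // smallest eigenvalue
    double angle;      // direction of the eigenvector for lambdaMax, in radians
};

// Closed-form eigendecomposition of the symmetric matrix
//   | sxx sxy |
//   | sxy syy |
TensorOrientation dominantOrientation(double sxx, double sxy, double syy) {
    double halfTrace = 0.5 * (sxx + syy);
    double halfDiff = 0.5 * (sxx - syy);
    double root = std::sqrt(halfDiff * halfDiff + sxy * sxy);
    // The principal axis satisfies tan(2 * angle) = 2 * sxy / (sxx - syy).
    double angle = 0.5 * std::atan2(2.0 * sxy, sxx - syy);
    return {halfTrace + root, halfTrace - root, angle};
}
```

When the two eigenvalues are nearly equal, the image has no dominant orientation and the returned angle carries little meaning.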
Currently I am trying to extract the hieroglyphic symbols from images like this one.
What I have done so far is use the Hough transform to find lines and split the image into portions to make it easier to work with. I then tried a set of algorithms to extract the sunken letters from the image, but I hit a dead end.
What I have tried is a mixture of morphological operations, edge detection and contour finding.
Are there any algorithms devised for something like this? Any hint will be appreciated.
You can up-sample the input image, apply some smoothing, and find the Otsu threshold, then use this threshold to find Canny edges with different window sizes.
For the larger window (5 x 5), you get a noisy image that contains almost all the edges you need, plus noise.
For the smaller window (3 x 3), you get a less noisy image, but some of the edges are missing.
If this less noisy image is not good enough, you can try morphologically reconstructing it using the noisy image as the mask. Here, I've linked some diagonal edge segments in the noisy image using a morphological hit-miss transform and then applied the reconstruction.
Using a
Mat k = (Mat_<int>(3, 3) <<
0, 0, 1,
0, -1, 0,
1, 0, 0);
kernel for linking broken edges, you get a thinner outline.
Please note that in the C++ code below, I've used a naive reconstruction.
Mat im = imread("rsSUY.png", 0);
/* up sample and smooth */
pyrUp(im, im);
GaussianBlur(im, im, Size(5, 5), 5);
/* find the Otsu threshold */
Mat bw1, bw2;
double th = threshold(im, bw1, 0, 255, THRESH_BINARY | THRESH_OTSU);
/* use the found Otsu threshold for Canny */
Canny(im, bw1, th, th/2, 5, true); /* this result would be noisy */
Canny(im, bw2, th, th/2, 3, true); /* this result would be less noisy */
/* link broken edges in more noisy image using hit-miss transform */
Mat k = (Mat_<int>(3, 3) <<
0, 0, 1,
0, -1, 0,
0, 0, 0);
Mat hitmiss;
morphologyEx(bw1, hitmiss, MORPH_HITMISS, k);
bw1 |= hitmiss;
/* apply morphological reconstruction to less noisy image using the modified noisy image */
Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));
double prevMu = 0;
Mat recons = bw2.clone();
for (int i = 0; i < 200; i++)
{
dilate(recons, recons, kernel);
recons &= bw1;
Scalar mu = mean(recons);
if (abs(mu.val[0] - prevMu) < 0.001)
{
break;
}
prevMu = mu.val[0];
}
imshow("less noisy", bw2);
imshow("reconstructed", recons);
waitKey();
The best bet for this task is machine learning. You can:
1. Crop or mark a few samples for each letter.
2. Train an SSD (Single Shot MultiBox Detector) using these samples.
The advantage is that you will be able to detect all letters in an image in one pass.
I have a rather simple but not so perfect solution.
1. Finding the optimal higher and lower threshold based on the median of the green channel of the image
Upper threshold image:
Lower threshold image:
2. Subtracting the two images followed by median filtering:
3. Canny edge detection:
To get a better finish you need to follow this up by some morphological operations.
I want to detect the very minimal movement of a conveyor belt using image evaluation (resolution: 31x512, image rate: 1000 per second). The moment the belt starts is what matters to me.
If I do cv::absdiff between two subsequent images, I obtain a very noisy result:
According to the mechanical rotation sensor of the motor, the movement starts here:
I tried to threshold the abs-diff image with a cascade of erosion and dilation, but I could only detect the earliest change more than a second too late in this image:
Is it possible to find the change earlier?
Here is the sequence of images without changes (according to the motor sensor):
In this sequence the movement begins in the middle image:
Looks like I've found a solution which works in MY case.
Instead of comparing the image changes in the spatial domain, cross-correlation should be applied:
I convert both images to the frequency domain with a DFT, multiply the spectra (with the second one conjugated), and convert back. The position of the maximum pixel value in the result is the correlation peak. As long as the images are the same, the peak stays in the same position; otherwise it moves.
The actual working code uses 3 images and 2 DFT products, one between images 1 and 2 and one between images 2 and 3:
Mat img1_( 512, 32, CV_16UC1 );
Mat img2_( 512, 32, CV_16UC1 );
Mat img3_( 512, 32, CV_16UC1 );
//read the data into the images however you want. I read from an MHD file
//Set ROI (if required)
Mat img1 = img1_(cv::Rect(0,200,32,100));
Mat img2 = img2_(cv::Rect(0,200,32,100));
Mat img3 = img3_(cv::Rect(0,200,32,100));
//Float mats for DFT
Mat img1f;
Mat img2f;
Mat img3f;
//DFT and product mats
Mat dft1,dft2,dft3,dftproduct,dftproduct2;
//Calculate DFT of the three images
img1.convertTo(img1f, CV_32FC1);
cv::dft(img1f, dft1);
img2.convertTo(img2f, CV_32FC1);
cv::dft(img2f, dft2);
img3.convertTo(img3f, CV_32FC1);
cv::dft(img3f, dft3);
//Multiply DFT Mats
cv::mulSpectrums(dft1,dft2,dftproduct,true);
cv::mulSpectrums(dft2,dft3,dftproduct2,true);
//Convert back to space domain
cv::Mat result,result2;
cv::idft(dftproduct,result);
cv::idft(dftproduct2,result2);
//Not sure if required, I needed it for visualizing
cv::normalize( result, result, 0, 255, NORM_MINMAX, CV_8UC1);
cv::normalize( result2, result2, 0, 255, NORM_MINMAX, CV_8UC1);
//Find maxima positions
double dummy;
Point locdummy; Point maxLoc1; Point maxLoc2;
cv::minMaxLoc(result, &dummy, &dummy, &locdummy, &maxLoc1);
cv::minMaxLoc(result2, &dummy, &dummy, &locdummy, &maxLoc2);
//Calculate products simply to have one value to compare
int maxlocProd1 = maxLoc1.x*maxLoc1.y;
int maxlocProd2 = maxLoc2.x*maxLoc2.y;
//Calculate absolute difference of the products. Not 0 means movement
int absPosDiff = std::abs(maxlocProd2-maxlocProd1);
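if ( absPosDiff>0 )
{
//This fragment sits inside a per-frame loop that is not shown; id is presumably that loop's frame index
std::cout << id<< std::endl;
break;
}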
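The idea behind the DFT product can be illustrated in one dimension without OpenCV. The sketch below computes the circular cross-correlation directly with a double loop (which is slow, but gives the same peak position that the DFT route finds more efficiently) and returns the shift of the peak. The function name correlationPeak and the 1-D signals are assumptions made for this example only.

```cpp
#include <cstddef>
#include <vector>

// Returns the circular shift s (0 <= s < n) that maximises
// sum_i a[i] * b[(i + s) % n], i.e. the peak of the circular
// cross-correlation of two equally long signals.
std::size_t correlationPeak(const std::vector<double>& a, const std::vector<double>& b) {
    const std::size_t n = a.size();
    std::size_t best = 0;
    double bestScore = 0.0;
    for (std::size_t s = 0; s < n; ++s) {
        double score = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            score += a[i] * b[(i + s) % n];
        }
        if (s == 0 || score > bestScore) {
            bestScore = score;
            best = s;
        }
    }
    return best;
}
```

Correlating a signal with itself puts the peak at shift 0; if the second signal is the first one moved by two samples, the peak moves to shift 2, which is the kind of change the code above watches for.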
I am trying to subtract the background from depth images acquired with a Kinect. When I learned what Otsu thresholding is, I thought it could help with this. By converting the depth image to grayscale, I can hopefully apply an Otsu threshold to binarize the image.
However, when I tried to implement this with OpenCV 2.3, it was in vain. The output image is binarized, but in a very unexpected way. I ran the thresholding continuously (i.e. displayed the result for each frame) and saw that for some frames the threshold is found to be around 160 and for others it is found to be 0. I can't quite understand why this is happening. Could it be due to the high number of 0s in the depth image returned by the Kinect, which correspond to pixels that could not be measured? Is there a way to tell the algorithm to ignore pixels with the value 0? Or is Otsu thresholding not suitable for what I am trying to do?
Here are some outputs and a segment of the related code. You may notice that the second screenshot looks like a reasonable binarization; however, I want one that distinctly differentiates between the pixels corresponding to the chair in the scene and the background.
Thanks.
cv::Mat1s depthcv(depth->getHeight(), depth->getWidth());
cv::Mat1b depthcv8(depth->getHeight(), depth->getWidth());
cv::Mat1b depthcv8_th(depth->getHeight(), depth->getWidth());
depthcv.data =(uchar*) depth->getDepthMetaData().Data();
depthcv.convertTo(depthcv8,CV_8U,255/5000.f);
//apply otsu thresholding
cv::threshold(depthcv8, depthcv8_th, 128, 255, CV_THRESH_BINARY|CV_THRESH_OTSU);
std::ofstream output;
output.open("output.txt");
//output << "M = "<< endl << " " << depthcv8 << endl << endl;
cv::imshow("lab",depthcv8_th);
cv::waitKey(1);
Otsu is probably good enough for what you are trying to do, but you do need to mask out the zero values before computing the optimal threshold with the Otsu algorithm, otherwise the distribution of intensity values will be skewed lower than what you want.
OpenCV does not provide a mask argument for the cv::threshold function, so you will have to remove those values yourself. I would recommend putting all the non-zero values in a 1 by N matrix, calling cv::threshold on it with CV_THRESH_OTSU, and saving the return value (which is the estimated optimal threshold). Then run cv::threshold again on the original image with just a binary threshold flag and the computed threshold.
Here is one possible implementation:
// move zeros to the back of a temp array (assumes origImg is single-channel 8-bit)
cv::Mat copyImg = origImg.clone(); // deep copy, so origImg itself is not reordered
uchar* ptr = copyImg.data;
uchar* ptr_end = copyImg.data + copyImg.total(); // one past the last element
while (ptr < ptr_end) {
    if (*ptr == 0) { // swap with the last element if zero
        ptr_end--; // make array smaller
        std::swap(*ptr, *ptr_end);
    } else {
        ptr++;
    }
}
// make a new matrix with only valid data
cv::Mat nz = cv::Mat(std::vector<uchar>(copyImg.data, ptr_end), true);
// compute optimal Otsu threshold
double thresh = cv::threshold(nz,nz,0,255,CV_THRESH_BINARY | CV_THRESH_OTSU);
// apply threshold
cv::threshold(origImg,origImg,thresh,255,CV_THRESH_BINARY_INV);
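To make explicit what "ignoring the zeros" means for the Otsu computation, here is a self-contained sketch that works on a plain histogram instead of calling cv::threshold. It is not OpenCV's implementation; the function name otsuIgnoringZeros and the flat std::vector input are assumptions made for illustration.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Otsu threshold computed over the non-zero pixels of an 8-bit image.
// Returns t such that values <= t form one class and values > t the other;
// returns 0 if there are no non-zero pixels.
int otsuIgnoringZeros(const std::vector<std::uint8_t>& pixels) {
    std::array<double, 256> hist{};
    for (std::uint8_t p : pixels) hist[p] += 1.0;
    hist[0] = 0.0;  // drop unmeasured pixels from the statistics

    double total = 0.0, sumAll = 0.0;
    for (int v = 0; v < 256; ++v) {
        total += hist[v];
        sumAll += v * hist[v];
    }
    if (total == 0.0) return 0;

    double wLow = 0.0, sumLow = 0.0, bestVar = -1.0;
    int bestT = 0;
    for (int t = 0; t < 256; ++t) {
        wLow += hist[t];
        sumLow += t * hist[t];
        double wHigh = total - wLow;
        if (wLow == 0.0 || wHigh == 0.0) continue;
        // Between-class variance (up to a constant factor) for this split.
        double diff = sumLow / wLow - (sumAll - sumLow) / wHigh;
        double betweenVar = wLow * wHigh * diff * diff;
        if (betweenVar > bestVar) {
            bestVar = betweenVar;
            bestT = t;
        }
    }
    return bestT;
}
```

With many zeros plus two clusters of valid depths, the returned threshold lands between the two clusters instead of being dragged toward zero by the unmeasured pixels.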
I'm using OpenNI for a project with a Kinect sensor. I'd like to color the user's pixels given by the depth map. Right now the pixels go from white to black, but I want them to go from red to black. I've tried alpha blending, but the result is pixels that go from pink to black, because adding red and white (with addWeighted) gives pink.
This is my current code:
layers = device.getDepth().clone();
cvtColor(layers, layers, CV_GRAY2BGR);
Mat red = Mat(240,320, CV_8UC3, Scalar(255,0,0));
Mat red_body; // = Mat::zeros(240,320, CV_8UC3);
red.copyTo(red_body, device.getUserMask());
addWeighted(red_body, 0.8, layers, 0.5, 0.0, layers);
where device.getDepth() returns a cv::Mat with the depth map and device.getUserMask() returns a cv::Mat with the user's pixels (white pixels only).
Any advice?
EDIT:
One more thing:
Thanks to sammy's answer I've done it. However, my values don't actually span 0 to 255, but (for example) 123 to 220.
I'm going to find the minimum and maximum with a simple for loop (is there a better way?). How can I map my values from min-max to 0-255?
First, OpenCV's default color format is BGR not RGB. So, your code for creating the red image should be
Mat red = Mat(240,320, CV_8UC3, Scalar(0,0,255));
For red to black color map, you can use element wise multiplication instead of alpha blending
Mat out = red_body.mul(layers, 1.0/255);
You can find the min and max values of a matrix M using
double minVal, maxVal;
minMaxLoc(M, &minVal, &maxVal, 0, 0);
You can then subtract minVal and scale by a factor:
double factor = 255.0/(maxVal - minVal);
M = factor*(M - minVal);
Kinda clumsy and slow, but maybe split layers, copy red_body (make it a one channel Mat, not 3) to the red channel, merge them back into layers?
Get the same effect, but much faster (in place) with reshape:
layers = device.getDepth().clone();
cvtColor(layers, layers, CV_GRAY2BGR);
Mat red = Mat(240,320, CV_8UC1, Scalar(255)); // One channel
Mat red_body;
red.copyTo(red_body, device.getUserMask());
Mat flatLayer = layers.reshape(1,240*320); // presumed dimensions of layer
red_body.reshape(0,240*320).copyTo(flatLayer.col(2)); // after GRAY2BGR the columns are B, G, R, so red is column 2
// layers now has the red from red_body