I am trying to implement difference of Gaussians (DoG) for a specific case of edge detection. As the name of the algorithm suggests, it is actually fairly straightforward:
Mat g1, g2, result;
Mat img = imread("test.png", CV_LOAD_IMAGE_COLOR);
GaussianBlur(img, g1, Size(1,1), 0);
GaussianBlur(img, g2, Size(3,3), 0);
result = g1 - g2;
However, I have the feeling that this can be done more efficiently. Can it perhaps be done in fewer passes over the data?
The question here has taught me about separable filters, but I'm too much of an image processing newbie to understand how to apply them in this case.
Can anyone give me some pointers on how one could optimise this?

A separable filter gives the same result as the ordinary 2D Gaussian filter but is cheaper to apply: for a k x k kernel the work per pixel drops from k*k to 2*k multiplications, so the saving grows with the kernel size. The filter kernel can be formed analytically and separated into two 1-dimensional vectors, one horizontal and one vertical.
For example, consider the filter
1 2 1
2 4 2
1 2 1
This filter can be separated into a horizontal vector (H) 1 2 1 and a vertical vector (V) 1 2 1. These two filters are applied to the image one after the other: H is applied along the rows, and V is then applied along the columns of that intermediate result. The output of the second pass is the Gaussian blur. I'm providing a function that does the separable Gaussian blur. (Please don't ask me about the comments, I'm too lazy :P)
Mat sepConv(const Mat& input, int radius)
{
    Mat tmp, dst;
    int ksize = 2 * radius + 1;
    double sigma = radius / 2.575;
    // ksize x 1 column vector of Gaussian coefficients
    Mat gau = getGaussianKernel(ksize, sigma, CV_32F);
    Mat gauT = gau.t();
    // Vertical pass with the column kernel, then horizontal pass with its transpose
    filter2D(input, tmp, -1, gau);
    filter2D(tmp, dst, -1, gauT);
    return dst;
}
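To make the "one pass after the other" point concrete, here is a minimal dependency-free sketch in plain Python (not OpenCV). The helper names `convolve_rows`, `separable_blur` and `full_blur` are made up for this illustration, and replicated borders are an assumption. It checks that filtering with 1 2 1 along the rows and then along the columns gives the same result as filtering once with the full 3x3 kernel.

```python
def convolve_rows(image, kernel):
    """Filter every row with a 1-D kernel, replicating edge pixels."""
    r = len(kernel) // 2
    out = []
    for row in image:
        n = len(row)
        out.append([
            sum(k * row[min(max(x + i - r, 0), n - 1)] for i, k in enumerate(kernel))
            for x in range(n)
        ])
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def separable_blur(image, kernel):
    """Horizontal pass, then a vertical pass applied to the horizontal result."""
    horizontal = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(horizontal), kernel))


def full_blur(image, kernel2d):
    """Reference: direct 2-D filtering with the same border handling."""
    r = len(kernel2d) // 2
    h, w = len(image), len(image[0])

    def px(y, x):
        return image[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    size = len(kernel2d)
    return [[sum(kernel2d[j][i] * px(y + j - r, x + i - r)
                 for j in range(size) for i in range(size))
             for x in range(w)] for y in range(h)]


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 8, 7, 6], [5, 4, 3, 2]]
k1d = [1, 2, 1]
k2d = [[a * b for b in k1d] for a in k1d]  # outer product: 1 2 1 / 2 4 2 / 1 2 1
print(separable_blur(image, k1d) == full_blur(image, k2d))  # prints True
```

The kernel here is left unnormalised (it sums to 16) so that the comparison stays in exact integer arithmetic.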
One more way to speed up the Gaussian blur is to use the FFT. FFT-based convolution can be much faster than the separable kernel method when the data (and in particular the kernel) is large.
A quick google search provided me with the following function
// A and B must be single-channel floating-point matrices (CV_32F or CV_64F)
Mat Conv2ByFFT(Mat A, Mat B)
{
    Mat C;
    // reallocate the output array if needed
    C.create(abs(A.rows - B.rows)+1, abs(A.cols - B.cols)+1, A.type());
    Size dftSize;
    // compute the size of DFT transform
    dftSize.width = getOptimalDFTSize(A.cols + B.cols - 1);
    dftSize.height = getOptimalDFTSize(A.rows + B.rows - 1);
    // allocate temporary buffers and initialize them with 0's
    Mat tempA(dftSize, A.type(), Scalar::all(0));
    Mat tempB(dftSize, B.type(), Scalar::all(0));
    // copy A and B to the top-left corners of tempA and tempB, respectively
    Mat roiA(tempA, Rect(0,0,A.cols,A.rows));
    A.copyTo(roiA);
    Mat roiB(tempB, Rect(0,0,B.cols,B.rows));
    B.copyTo(roiB);
    // now transform the padded A & B in-place;
    // use "nonzeroRows" hint for faster processing
    dft(tempA, tempA, 0, A.rows);
    dft(tempB, tempB, 0, B.rows);
    // multiply the spectrums;
    // the function handles packed spectrum representations well.
    // conjB == true computes a correlation, which equals the convolution
    // for a symmetric kernel such as a Gaussian
    mulSpectrums(tempA, tempB, tempA, 0, true);
    // transform the product back from the frequency domain.
    // Even though all the result rows will be non-zero,
    // we need only the first C.rows of them, and thus we
    // pass nonzeroRows == C.rows
    dft(tempA, tempA, DFT_INVERSE + DFT_SCALE, C.rows);
    // now copy the result back to C.
    tempA(Rect(0, 0, C.cols, C.rows)).copyTo(C);
    // all the temporary buffers will be deallocated automatically
    return C;
}
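The idea behind the function above (zero-pad, transform, multiply the spectra, transform back) can be sketched in one dimension with nothing but the Python standard library. This is only an illustration: `dft`, `convolve_via_dft` and `convolve_direct` are hypothetical helper names, and the DFT here is the naive O(n^2) form, so it shows the identity, not the speed-up.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (for illustration only)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out


def convolve_via_dft(signal, kernel):
    """Linear convolution: zero-pad, multiply the spectra, transform back."""
    n = len(signal) + len(kernel) - 1
    a = list(signal) + [0] * (n - len(signal))
    b = list(kernel) + [0] * (n - len(kernel))
    product = [x * y for x, y in zip(dft(a), dft(b))]
    return [c.real for c in dft(product, inverse=True)]


def convolve_direct(signal, kernel):
    """Reference: textbook sliding-sum convolution."""
    out = [0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


direct = convolve_direct([1, 2, 3, 4], [1, 2, 1])
via_dft = convolve_via_dft([1, 2, 3, 4], [1, 2, 1])
print(direct)  # [1, 4, 8, 12, 11, 4]
print(all(abs(x - y) < 1e-9 for x, y in zip(direct, via_dft)))  # prints True
```

Padding to len(signal) + len(kernel) - 1 is what turns the circular convolution that the DFT naturally computes into an ordinary linear one; that is the same reason the OpenCV version pads to A.cols + B.cols - 1.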
Hope this helps. :)

I know this post is old, but the question is interesting and may interest future readers. As far as I know, a DoG filter is not separable, so there are two solutions left:
1) compute both convolutions by calling the function GaussianBlur() twice then subtract the two images
2) Make a kernel by computing the difference of two gaussian kernels then convolve it with the image.
About which solution is faster:
Solution 2 seems faster at first sight because it convolves the image only once.
But it does not involve a separable filter. The first solution, on the contrary, involves two separable filters and may be faster in the end. (I do not know how the OpenCV function GaussianBlur() is optimised and whether it uses separable filters or not, but it is likely.)
However, if one uses an FFT technique to convolve, the second solution is surely faster.
If anyone has any advice to add or wishes to correct me, please do.
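For what it's worth, here is a small plain-Python sketch of the two options (the function names, the 5x5 size and the sigmas 0.8 and 1.6 are all assumptions picked for illustration). It checks that blurring twice and subtracting gives the same image as a single pass with the difference of the two kernels, and prints a rough per-pixel multiplication count for each route.

```python
import math


def gaussian_kernel_2d(size, sigma):
    """Normalised size x size Gaussian kernel."""
    r = size // 2
    raw = [[math.exp(-((x - r) ** 2 + (y - r) ** 2) / (2.0 * sigma ** 2))
            for x in range(size)] for y in range(size)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


def dog_kernel(size, sigma_small, sigma_large):
    """Solution 2: subtract the kernels first, then convolve once."""
    g1 = gaussian_kernel_2d(size, sigma_small)
    g2 = gaussian_kernel_2d(size, sigma_large)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(g1, g2)]


def filter2d(image, kernel):
    """Direct 2-D filtering with replicated borders."""
    r = len(kernel) // 2
    h, w = len(image), len(image[0])

    def px(y, x):
        return image[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    size = len(kernel)
    return [[sum(kernel[j][i] * px(y + j - r, x + i - r)
                 for j in range(size) for i in range(size))
             for x in range(w)] for y in range(h)]


def subtract(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


image = [[0, 0, 0, 9, 9, 9] for _ in range(6)]  # vertical step edge
size, s1, s2 = 5, 0.8, 1.6                      # assumed kernel size and sigmas
two_blurs = subtract(filter2d(image, gaussian_kernel_2d(size, s1)),
                     filter2d(image, gaussian_kernel_2d(size, s2)))
one_pass = filter2d(image, dog_kernel(size, s1, s2))
max_diff = max(abs(p - q) for ra, rb in zip(two_blurs, one_pass)
               for p, q in zip(ra, rb))
print(max_diff < 1e-9)  # prints True
# Multiplications per pixel: two separable blurs (2 * 2k) vs one full k x k kernel
print(2 * 2 * size, size * size)  # prints 20 25
```

With these counts the two separable blurs win for kernels of 5x5 and larger, while for a 3x3 kernel (12 vs 9) the single pass does fewer multiplications. Real timings will also depend on memory traffic and on how the library is optimised, so treat this as a rough guide only.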


Calculating sharpness of an image

I found on the internet that the Laplacian method is quite a good technique for computing the sharpness of an image. I was trying to implement it in OpenCV 2.4.10. How can I get the sharpness measure after applying the Laplacian function? Below is the code:
Mat src_gray, dst;
int kernel_size = 3;
int scale = 1;
int delta = 0;
int ddepth = CV_16S;
GaussianBlur( src, src, Size(3,3), 0, 0, BORDER_DEFAULT );
/// Convert the image to grayscale
cvtColor( src, src_gray, CV_RGB2GRAY );
/// Apply Laplace function
Mat abs_dst;
Laplacian( src_gray, dst, ddepth, kernel_size, scale, delta, BORDER_DEFAULT );
//compute sharpness
Can someone please guide me on this?
Possible duplicate of: Is there a way to detect if an image is blurry?
so your focus measure is:
cv::Laplacian(src_gray, dst, CV_64F);
cv::Scalar mu, sigma;
cv::meanStdDev(dst, mu, sigma);
double focusMeasure = sigma.val[0] * sigma.val[0];
Edit #1:
Okay, so a well-focused image is expected to have sharper edges, so image gradients are instrumental in determining a reliable focus measure. Given an image gradient, the focus measure pools the data at each point into a single value.
The use of second derivatives is one technique for passing the high spatial frequencies, which are associated with sharp edges. As a second derivative operator we use the Laplacian operator, which is approximated by convolving the image with a small mask L.
To pool the data at each point, we use two methods. The first one is the sum of all the absolute values, leading to the focus measure $\sum_m \sum_n |L(m, n)|$,
where L(m, n) is the convolution of the input image I(m, n) with the mask L. The second method calculates the variance of the absolute values, giving the focus measure $\frac{1}{MN} \sum_m \sum_n \left(|L(m, n)| - \overline{L}\right)^2$ for an M x N image,
where $\overline{L}$ is the mean of the absolute values.
Read the article J.L. Pech-Pacheco, G. Cristobal, J. Chamorro-Martinez, J. Fernandez-Valdivia, "Diatom autofocusing in brightfield microscopy: a comparative study", 15th International Conference on Pattern Recognition, 2000 (Volume 3), for more information.
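The two pooling methods can be sketched in plain Python as follows. This is an illustration under assumptions: the 4-neighbour Laplacian mask used here is a common approximation and not necessarily the exact mask from the article, border pixels are simply skipped, and `lap_abs` / `lap_var` are made-up names.

```python
# Assumed mask: the common 4-neighbour Laplacian approximation.
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def laplacian_response(image):
    """Laplacian at every interior pixel (borders skipped to keep this short)."""
    h, w = len(image), len(image[0])
    return [sum(LAPLACIAN[j][i] * image[y + j - 1][x + i - 1]
                for j in range(3) for i in range(3))
            for y in range(1, h - 1) for x in range(1, w - 1)]


def lap_abs(image):
    """First pooling method: sum of the absolute Laplacian values."""
    return sum(abs(v) for v in laplacian_response(image))


def lap_var(image):
    """Second pooling method: variance of the absolute Laplacian values."""
    mags = [abs(v) for v in laplacian_response(image)]
    mean = sum(mags) / len(mags)
    return sum((m - mean) ** 2 for m in mags) / len(mags)


sharp = [[0, 0, 0, 100, 100, 100] for _ in range(6)]   # hard step edge
ramp = [[0, 20, 40, 60, 80, 100] for _ in range(6)]    # same range, smeared out
print(lap_abs(sharp), lap_var(sharp))  # prints 800 2500.0
print(lap_abs(ramp), lap_var(ramp))    # prints 0 0.0
```

Both measures are large for the hard edge and drop to zero for the linear ramp, which is the behaviour a focus measure needs. Note that the OpenCV snippet earlier in this answer takes the variance of the signed Laplacian rather than of its absolute value.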
Not exactly the answer, but I got a formula using an intuitive approach that worked in the wild.
I'm currently working on a script to detect multiple faces in a picture of a crowd using mtcnn, which worked very well; however, it also detected many faces so blurry that you couldn't properly call them faces.
Example image:
Faces detected:
Matrix of detected faces:
mtcnn detected about 123 faces, but many of them bore little resemblance to a face. In fact, many faces look more like a stain than anything else...
So I was looking for a way of 'filtering' those blurry faces. I tried the Laplacian filter and the FFT way of filtering I found in this answer, but I got inconsistent results and poor filtering.
I turned my research to computer vision topics, and finally tried to implement an 'intuitive' way of filtering using the following principle:
The blurrier an image is, the fewer 'edges' it has
If we compare a crisp image with a blurred version of the same image, the blur tends to 'soften' any edges or adjacent contrasting regions. Based on that principle, I looked for a way of weighting edges and then a simple way of 'measuring' the result to get a confidence value.
I took advantage of Canny detection in OpenCV and then took the mean value of the result (Python):
import cv2
import numpy as np

def getBlurValue(image):
    canny = cv2.Canny(image, 50, 250)
    return np.mean(canny)
Canny returns a 2D array of the same size as the image. I selected thresholds 50 and 250, but they can be changed depending on your image and scenario.
Then I took the average value of the Canny result (definitely a formula to be improved if you know what you're doing).
When an image is blurred the result tends to zero, while a crisp image gives a positive value, higher the crisper the image is.
This value depends on the images and thresholds, so it is not a universal solution for every scenario; however, a better value can be achieved by normalizing the result and averaging over all the faces (I need more work on that subject).
In the example, the values are in the range 0-27.
I averaged all the faces and got a blur value of about 3.7.
If I filter images above 3.7:
So I kept the most crisp faces:
That consistently gave me better results than the other tests.
Ok, you got me. This is a tricky way of detecting blurriness values inside the same image space. But I hope people can take advantage of these findings and apply what I learned in their own projects.
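The "score every face, then keep the ones above the average" step can be sketched without any dependencies like this. Big caveat: `edge_score` below uses a crude neighbour-difference detector as a stand-in, it is NOT Canny, and both function names and the threshold of 50 are assumptions for illustration.

```python
def edge_score(image, threshold=50):
    """Mean of a 0/255 edge map, mimicking np.mean(canny).

    Stand-in detector (not Canny): a pixel counts as an edge when the
    difference to its right or lower neighbour exceeds `threshold`.
    """
    h, w = len(image), len(image[0])
    edges = 0
    for y in range(h - 1):
        for x in range(w - 1):
            gx = abs(image[y][x + 1] - image[y][x])
            gy = abs(image[y + 1][x] - image[y][x])
            if max(gx, gy) > threshold:
                edges += 1
    return 255.0 * edges / (h * w)


def keep_crisp(faces, threshold=50):
    """Keep the faces whose score is above the average score of all faces."""
    scores = [edge_score(face, threshold) for face in faces]
    cutoff = sum(scores) / len(scores)
    kept = [face for face, score in zip(faces, scores) if score > cutoff]
    return kept, cutoff


crisp = [[0, 0, 255, 255] for _ in range(4)]   # hard edge
blurry = [[0, 40, 80, 120] for _ in range(4)]  # soft gradient, no strong edge
kept, cutoff = keep_crisp([crisp, blurry])
print(len(kept), cutoff)  # prints 1 23.90625
```

Because the cutoff is the average over the faces found in one picture, it adapts to that picture, which is also why it does not transfer to other images as a fixed number.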

iOS & OpenCV: Image Registration / Alignment

I am doing a project of combining multiple images similar to HDR in iOS. I have managed to get 3 images of different exposures through the Camera and now I want to align them because during the capture, one's hand must have shaken and resulted in all 3 images having slightly different alignment.
I have imported the OpenCV framework and have been exploring its functions to align/register images, but found nothing. Is there actually a function in OpenCV to achieve this? If not, are there any alternatives?
In OpenCV 3.0 you can use findTransformECC. I have copied this ECC image alignment code from a post where a very similar problem is solved for aligning color channels. The post also contains code in Python. Hope this helps.
// Read the images to be aligned
Mat im1 = imread("images/image1.jpg");
Mat im2 = imread("images/image2.jpg");
// Convert images to gray scale;
Mat im1_gray, im2_gray;
cvtColor(im1, im1_gray, CV_BGR2GRAY);
cvtColor(im2, im2_gray, CV_BGR2GRAY);
// Define the motion model
const int warp_mode = MOTION_EUCLIDEAN;
// Set a 2x3 or 3x3 warp matrix depending on the motion model.
Mat warp_matrix;
// Initialize the matrix to identity
if ( warp_mode == MOTION_HOMOGRAPHY )
    warp_matrix = Mat::eye(3, 3, CV_32F);
else
    warp_matrix = Mat::eye(2, 3, CV_32F);
// Specify the number of iterations.
int number_of_iterations = 5000;
// Specify the threshold of the increment
// in the correlation coefficient between two iterations
double termination_eps = 1e-10;
// Define termination criteria
TermCriteria criteria (TermCriteria::COUNT+TermCriteria::EPS, number_of_iterations, termination_eps);
// Run the ECC algorithm. The results are stored in warp_matrix.
findTransformECC(im1_gray, im2_gray, warp_matrix, warp_mode, criteria);
// Storage for warped image.
Mat im2_aligned;
if (warp_mode != MOTION_HOMOGRAPHY)
    // Use warpAffine for Translation, Euclidean and Affine
    warpAffine(im2, im2_aligned, warp_matrix, im1.size(), INTER_LINEAR + WARP_INVERSE_MAP);
else
    // Use warpPerspective for Homography
    warpPerspective (im2, im2_aligned, warp_matrix, im1.size(),INTER_LINEAR + WARP_INVERSE_MAP);
// Show final result
imshow("Image 1", im1);
imshow("Image 2", im2);
imshow("Image 2 Aligned", im2_aligned);
// Keep the windows open until a key is pressed
waitKey(0);
There is no single function called something like align; you need to implement it yourself or find an existing implementation.
Here is one solution.
You need to extract keypoints from all 3 images and try to match them. Be sure that your keypoint extraction technique is invariant to illumination changes, since the images have different intensity values because of the different exposures. You need to match your keypoints and find the disparity between them. Then you can use the disparity to align your images.
Keep in mind that this answer is superficial; for details you first need to do some research on keypoint/descriptor extraction and keypoint/descriptor matching.
Good luck!
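As a rough illustration of the last step only (going from matched keypoints to an alignment), here is a plain-Python sketch. It assumes the camera shake is a pure translation and that the keypoint matches have already been computed by some detector and matcher; `estimate_translation` and `align_point` are hypothetical names. The median is used so that a few wrong matches do not drag the estimate.

```python
from statistics import median


def estimate_translation(matches):
    """Estimate the shift between two images from matched keypoints.

    matches: list of ((x1, y1), (x2, y2)) pairs, reference image first.
    Assumes a pure translation; the median tolerates a few bad matches.
    """
    dx = median([x2 - x1 for (x1, _), (x2, _) in matches])
    dy = median([y2 - y1 for (_, y1), (_, y2) in matches])
    return dx, dy


def align_point(point, shift):
    """Map a point of the second image back onto the reference image."""
    return point[0] - shift[0], point[1] - shift[1]


matches = [
    ((10, 10), (13, 8)),
    ((20, 5), (23, 3)),
    ((7, 30), (10, 28)),
    ((0, 0), (40, 40)),   # a wrong match (outlier)
]
shift = estimate_translation(matches)
print(shift)                        # prints (3.0, -2.0)
print(align_point((13, 8), shift))  # prints (10.0, 10.0)
```

For rotation or perspective changes a translation is not enough, and you would fit an affine transform or a homography to the matches instead.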

Threshold of blurry image - part 2

How can I threshold this blurry image to make the digits as clear as possible?
In a previous post, I tried adaptively thresholding a blurry image (left), which resulted in distorted and disconnected digits (right):
Since then, I've tried using a morphological closing operation as described in this post to make the brightness of the image uniform:
If I adaptively threshold this image, I don't get significantly better results. However, because the brightness is approximately uniform, I can now use an ordinary threshold:
This is a lot better than before, but I have two problems:
I had to manually choose the threshold value. Although the closing operation results in uniform brightness, the level of brightness might be different for other images.
Different parts of the image would do better with slight variations in the threshold level. For instance, the 9 and 7 in the top left come out partially faded and should have a lower threshold, while some of the 6s have fused into 8s and should have a higher threshold.
I thought that going back to an adaptive threshold, but with a very large block size (1/9th of the image) would solve both problems. Instead, I end up with a weird "halo effect" where the centre of the image is a lot brighter, but the edges are about the same as the normally-thresholded image:
Edit: remi suggested morphologically opening the thresholded image at the top right of this post. This doesn't work too well. Using elliptical kernels, only a 3x3 is small enough to avoid obliterating the image entirely, and even then there are significant breakages in the digits:
Edit2: mmgp suggested using a Wiener filter to remove blur. I adapted this code for Wiener filtering in OpenCV to OpenCV4Android, but it makes the image even blurrier! Here's the image before (left) and after filtering with my code and a 5x5 kernel:
Here is my adapted code, which filters in-place:
private void wiener(Mat input, int nRows, int nCols) { // I tried nRows=5 and nCols=5
Mat localMean = new Mat(input.rows(), input.cols(), input.type());
Mat temp = new Mat(input.rows(), input.cols(), input.type());
Mat temp2 = new Mat(input.rows(), input.cols(), input.type());
// Create the kernel for convolution: a constant matrix with nRows rows
// and nCols cols, normalized so that the sum of the pixels is 1.
Mat kernel = new Mat(nRows, nCols, CvType.CV_32F, new Scalar(1.0 / (double) (nRows * nCols)));
// Get the local mean of the input. localMean = convolution(input, kernel)
Imgproc.filter2D(input, localMean, -1, kernel, new Point(nCols/2, nRows/2), 0);
// Get the local variance of the input. localVariance = convolution(input^2, kernel) - localMean^2
Core.multiply(input, input, temp); // temp = input^2
Imgproc.filter2D(temp, temp, -1, kernel, new Point(nCols/2, nRows/2), 0); // temp = convolution(input^2, kernel)
Core.multiply(localMean, localMean, temp2); //temp2 = localMean^2
Core.subtract(temp, temp2, temp); // temp = localVariance = convolution(input^2, kernel) - localMean^2
// Estimate the noise as mean(localVariance)
Scalar noise = Core.mean(temp);
// Compute the result. result = localMean + max(0, localVariance - noise) / max(localVariance, noise) * (input - localMean)
Core.max(temp, noise, temp2); // temp2 = max(localVariance, noise)
Core.subtract(temp, noise, temp); // temp = localVariance - noise
Core.max(temp, new Scalar(0), temp); // temp = max(0, localVariance - noise)
Core.divide(temp, temp2, temp); // temp = max(0, localVar-noise) / max(localVariance, noise)
Core.subtract(input, localMean, input); // input = input - localMean
Core.multiply(temp, input, input); // input = max(0, localVariance - noise) / max(localVariance, noise) * (input - localMean)
Core.add(input, localMean, input); // input = localMean + max(0, localVariance - noise) / max(localVariance, noise) * (input - localMean)
}
Some hints that you might try out:
Apply a morphological opening to your original thresholded image (the noisy one at the right of the first picture). You should get rid of most of the background noise and be able to reconnect the digits.
Use a different preprocessing of your original image instead of the morphological closing, such as a median filter (tends to blur the edges) or bilateral filtering, which preserves the edges better but is slower to compute.
As far as the threshold is concerned, you can use the THRESH_OTSU flag (CV_THRESH_OTSU in the C API) in cv::threshold to determine an optimal value for a global threshold. Local thresholding might still be better, but it should work better with the bilateral or median filter.
I've tried thresholding each 3x3 box separately, using Otsu's algorithm (THRESH_OTSU - thanks remi!) to determine an optimal threshold value for each box. This works a bit better than thresholding the entire image, and is probably a bit more robust.
Better solutions are welcome, though.
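To show why a per-box Otsu threshold can beat a single global one, here is a small plain-Python sketch of Otsu's method. `otsu_threshold` and `binarize` are hypothetical helpers written for this illustration, not the OpenCV implementation, and each "box" is just a flat list of 8-bit pixel values.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximises between-class variance.

    Pixels <= t form one class and pixels > t the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(pixels, t):
    return [1 if p > t else 0 for p in pixels]


dark_box = [10] * 5 + [60] * 5       # dim digit on a dark background
bright_box = [150] * 5 + [240] * 5   # same pattern in a brighter box

# One global threshold over both boxes loses the digit in the dark box ...
global_t = otsu_threshold(dark_box + bright_box)
print(binarize(dark_box, global_t))  # prints [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# ... while a threshold computed per box keeps it.
print(binarize(dark_box, otsu_threshold(dark_box)))  # prints [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

The price of the per-box approach is that an empty box still gets split into two classes, so empty cells need to be detected separately.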
If you're willing to spend some cycles on it there are de-blurring techniques that could be used to sharpen up the picture prior to processing. Nothing in OpenCV yet but if this is a make-or-break kind of thing you could add it.
There's a bunch of literature on the subject:
And some chatter on the OpenCV mailing list:
The weird "halo effect" that you're seeing is likely due to OpenCV assuming black for the color when the adaptive threshold is at/near the edge of the image and the window that it's using "hangs over" the edge into non-image territory. There are ways to correct for this. Most likely you would make a temporary image that's at least two full block-sizes taller and wider than the image from the camera, then copy the camera image into the middle of it, and then set the surrounding "blank" portion of the temp image to the average color of the image from the camera. Now when you perform the adaptive threshold, the data at/near the edges will be much closer to accurate. It won't be perfect since it's not a real picture, but it will yield better results than the black that OpenCV is assuming is there.
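The padding trick described above might look roughly like this in plain Python (`pad_with_mean` and `crop` are made-up helper names, and a grayscale image stored as a list of rows is assumed). In OpenCV itself, cv::copyMakeBorder with BORDER_CONSTANT and the mean as the border value should do the same job.

```python
def pad_with_mean(image, pad):
    """Surround a grayscale image with a border filled with its mean value."""
    h, w = len(image), len(image[0])
    mean = round(sum(sum(row) for row in image) / (h * w))
    padded = [[mean] * (w + 2 * pad) for _ in range(h + 2 * pad)]
    for y in range(h):
        padded[y + pad][pad:pad + w] = image[y]
    return padded


def crop(image, pad):
    """Remove the border again after thresholding the padded image."""
    return [row[pad:len(row) - pad] for row in image[pad:len(image) - pad]]


image = [[10, 20], [30, 40]]
padded = pad_with_mean(image, 2)
print(len(padded), len(padded[0]))  # prints 6 6
print(padded[0][0])                 # prints 25
print(crop(padded, 2) == image)     # prints True
```

You would run the adaptive threshold on `padded` with `pad` at least as large as the block size, then crop the result back to the original size.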
My proposal assumes you can identify the sudoku cells, which I think, is not asking too much. Trying to apply morphological operators (although I really like them) and/or binarization methods as a first step is the wrong way here, in my opinion of course. Your image is at least partially blurry, for whatever reason (original camera angle and/or movement, among other reasons). So what you need is to revert that, by performing a deconvolution. Of course asking for a perfect deconvolution is too much, but we can try some things.
One of these "things" is the Wiener filter, and in Matlab, for instance, the function is named deconvwnr. I noticed the blur to be in the vertical direction, so we can perform a deconvolution with a vertical kernel of a certain length (10 in the following example) and also assume the input is not noise free (assumption of 5%) -- I'm just trying to give a very superficial view here, take it easy. In Matlab, your problem is at least partially solved by doing:
f = imread('some_sudoku_cell.png');
g = deconvwnr(f, fspecial('motion', 10, 90), 0.05);
h = im2bw(g, graythresh(g)); % graythresh is the Otsu method
Here are the results from some of your cells (original, otsu, otsu of region growing, morphological enhanced image, otsu from morphological enhanced image with region growing, otsu of deconvolution):
The enhanced image was produced by performing original + tophat(original) - bottomhat(original) with a flat disk of radius 3. I manually picked the seed point for region growing and manually picked the best threshold.
For empty cells you get weird results (original and otsu of deconvolution):
But I don't think you would have trouble detecting whether a cell is empty or not (the global threshold already solves it).
Added the best results I could get with a different approach: region growing. I also attempted some other approaches, but this was the second best one.

Reshaping noisy coin into a circle form

I'm doing coin detection using JavaCV (an OpenCV wrapper), but I have a little problem when the coins are connected. If I try to erode them to separate the coins, they lose their circular form, and if I try to count the pixels inside each coin, some coins can be miscounted as a bigger one. What I want to do is first reshape them into circles (with the radius of that coin) and then count the pixels inside them.
Here is my thresholded image:
And here is eroded image:
Any suggestions? Or is there any better way to break bridges between coins?
It looks similar to a problem I recently had to separate bacterial colonies growing on agar plates.
I performed a distance transform on the thresholded image (in your case you will need to invert it).
Then I found the peaks of the distance map (by calculating the difference between the dilated distance map and the distance map and finding the zero values).
Then, I assumed each peak to be the centre of a circle (coin) and the value of the peak in the distance map to be the radius of the circle.
Here is the result of your image after this pipeline:
I am new to OpenCV, and c++ so my code is probably very messy, but I did that:
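#include <opencv2/opencv.hpp>
#include <vector>

int main( int argc, char** argv ){
    cv::Mat objects, distance, peaks, results;
    std::vector<std::vector<cv::Point> > contours;
    /* NOTE: "coins.png" is a placeholder file name, and the threshold value (125)
     * and the dilation iteration counts below are assumed values to tune */
    objects = cv::imread("coins.png");
    objects.copyTo(results);
    cv::cvtColor(objects, objects, CV_BGR2GRAY);
    cv::blur( objects,objects,cv::Size(3,3));
    cv::threshold(objects, objects, 125, 255, cv::THRESH_BINARY_INV);
    /*Applies a distance transform to "objects".
     * The result is saved in "distance" */
    cv::distanceTransform(objects, distance, CV_DIST_L2, CV_DIST_MASK_5);
    /* In order to find the local maxima, "distance"
     * is subtracted from the result of the dilation of
     * "distance". All the peaks keep the same value */
    cv::dilate(distance, peaks, cv::Mat(), cv::Point(-1,-1), 3);
    cv::dilate(objects, objects, cv::Mat(), cv::Point(-1,-1), 3);
    /* Now all the peaks should be exactly 0*/
    peaks = peaks - distance;
    /* And the non-peaks 255*/
    cv::threshold(peaks, peaks, 0, 255, cv::THRESH_BINARY);
    peaks.convertTo(peaks, CV_8U);
    /* Only the zero values of "peaks" that are non-zero
     * in "objects" are the real peaks*/
    cv::bitwise_xor(peaks, objects, peaks);
    /* The peaks that are less than
     * 2 pixels apart are merged by dilation */
    cv::dilate(peaks, peaks, cv::Mat(), cv::Point(-1,-1), 1);
    /* In order to map the peaks, findContours() is used.
     * The results are stored in "contours" */
    cv::findContours(peaks, contours, CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE);
    /* The next steps are applied only if, at least,
     * one contour exists */
    if(contours.size()>0){
        /* Defines vectors to store the moments of the peaks, the centers
         * and the theoretical circles of the objects of interest*/
        std::vector <cv::Moments> moms(contours.size());
        std::vector <cv::Point> centers(contours.size());
        std::vector<cv::Vec3f> circles(contours.size());
        float rad,x,y;
        /* Calculates the moments of each peak and then the center of the peak,
         * which is approximately the center of each object of interest*/
        for(unsigned int i=0;i<contours.size();i++) {
            moms[i]= cv::moments(contours[i]);
            centers[i]= cv::Point(moms[i].m10/moms[i].m00,moms[i].m01/moms[i].m00);
            x= (float) (centers[i].x);
            y= (float) (centers[i].y);
            if(x>0 && y>0){
                /* The distance-map value at the center is the radius */
                rad= (float) (distance.at<float>((int)y,(int)x)+1);
                circles[i][0]= x;
                circles[i][1]= y;
                circles[i][2]= rad;
                cv::circle(results,centers[i],rad+1,cv::Scalar( 255, 0,0 ), 2, 4, 0 );
            }
        }
        /* Placeholder output file name */
        cv::imwrite("coins_circles.png", results);
    }
    return 0;
}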
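The core of the pipeline (distance map, then peaks where the dilated map equals the map itself, then peak value as radius) can be tried out without OpenCV on a tiny synthetic image. This is a plain-Python sketch with made-up helper names, a brute-force distance transform, and two assumed overlapping discs standing in for touching coins.

```python
import math


def disc_mask(height, width, discs):
    """Binary mask with a filled disc for every (cy, cx, r) in `discs`."""
    return [[any((y - cy) ** 2 + (x - cx) ** 2 <= r * r for cy, cx, r in discs)
             for x in range(width)] for y in range(height)]


def distance_map(mask):
    """Brute-force Euclidean distance to the nearest background pixel."""
    h, w = len(mask), len(mask[0])
    background = [(y, x) for y in range(h) for x in range(w) if not mask[y][x]]
    return [[min(math.hypot(y - by, x - bx) for by, bx in background)
             if mask[y][x] else 0.0
             for x in range(w)] for y in range(h)]


def find_peaks(dist, radius=2):
    """Foreground pixels whose value equals the maximum of their neighbourhood,
    i.e. where the dilated distance map equals the distance map."""
    h, w = len(dist), len(dist[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            if dist[y][x] <= 0:
                continue
            window = [dist[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            if dist[y][x] == max(window):
                peaks.append((y, x))
    return peaks


# Two overlapping "coins" of radius 6 whose centres are 11 pixels apart.
mask = disc_mask(15, 26, [(7, 7, 6), (7, 18, 6)])
dist = distance_map(mask)
for y, x in find_peaks(dist):
    print("centre", (y, x), "radius", round(dist[y][x], 2))
```

The two disc centres come out as peaks even though the discs are joined in the mask, and the distance value at each centre is close to the true radius. On real, noisy masks you would expect extra small peaks, which is why the C++ code above merges nearby peaks with a dilation.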
You don't need to erode, just a good set of params for cvHoughCircles():
The code used to generate this image came from my other post: Detecting Circles, with these parameters:
CvSeq* circles = cvHoughCircles(gray, storage, CV_HOUGH_GRADIENT, 1, gray->height/12, 80, 26);
OpenCV has a function called HoughCircles() that can be applied to your case, without separating the different circles. Can you call it from JavaCV ? If so, it will do what you want (detecting and counting circles), bypassing your separation problem.
The main point is to detect the circles accurately without separating them first. Other algorithms (such as template matching) can be used instead of the generalized Hough transform, but you have to take into account the different sizes of the coins.
The usual approach for erosion-based object recognition is to label continuous regions in the eroded image and then re-grow them until they match the regions in the original image. Hough circles is a better idea in your case, though.
After detecting the joined coins, I recommend applying morphological operations to classify areas as "definitely coin" and "definitely not coin", apply a distance transformation, then run the watershed to determine the boundaries. This scenario is actually the demonstration example for the watershed algorithm in OpenCV − perhaps it was created in response to this question.

How to smooth a cyclic column vector

This is an OpenCV2 question.
I have a matrix representing a closed space curve.
cv::Mat_<Point3f> points;
I want to smooth it (using, for example a Gaussian kernel).
I have tried using:
cv::Mat_<Point3f> result;
cv::GaussianBlur(points, result, cv::Size(4 * sigma, 1), sigma, sigma, cv::BORDER_WRAP);
But I get the error:
Assertion failed (columnBorderType != BORDER_WRAP)
What is the best way to convolve a cyclic vector in OpenCV? ("Best" should take into account space and time requirements.)
I found a way. I repeat the matrix, then blur, then extract a range.
cv::Mat ret;
cv::GaussianBlur(cv::repeat(points, 3, 1), ret, cv::Size(0,0), sigma);
int rows = points.rows;
// Range is half-open: [rows, 2 * rows) selects the middle copy
result = cv::Mat(ret, cv::Range(rows, 2 * rows), cv::Range::all());
This requires extra work (and extra space?).
Edit: I now manually expand points by copying (wrapping) as many points as are required by the kernel. I then crop off the extra points. This is similar to the above, but wastes less space and time.
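The wrapping approach from the edit can be sketched in plain Python like this. The function names and the choice of a kernel radius of 3 * sigma are assumptions for the illustration; the curve is a list of (x, y, z) tuples rather than a cv::Mat.

```python
import math


def gaussian_kernel(sigma):
    """Normalised 1-D Gaussian; a radius of 3 * sigma is an assumed cut-off."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_closed_curve(points, sigma):
    """Gaussian-smooth a closed curve given as a list of (x, y, z) tuples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(points)
    # Wrap only `radius` points on each side instead of tripling the data.
    padded = [points[(i - radius) % n] for i in range(n + 2 * radius)]
    return [tuple(sum(k * padded[i + j][c] for j, k in enumerate(kernel))
                  for c in range(3))
            for i in range(n)]


square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
smoothed = smooth_closed_curve(square, 0.5)
print(len(smoothed))  # prints 4
```

The extra memory is 2 * radius points rather than two full copies of the curve, and the modulo indexing keeps it correct even when the kernel is longer than the curve.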
