Technique to introduce normalisation/consistency to std dev comparison? - opencv

I am implementing a very simple segmentation algorithm for single channel images. The algorithm works like so:
For a single channel image:
1. Calculate the standard deviation, i.e. measure how much the luminosity varies across the image.
2. If the stddev > 15 (aka threshold):
   - Divide the image into 4 cells/images
   - For each cell: repeat step 1 and step 2 (go recursive)
   Else:
   - Draw a rectangle on the source image to signify a segment lies in these bounds.
My problem occurs because my threshold is constant, and once I go recursive 15 is no longer a good signifier of whether that image is homogeneous or not. How can I introduce consistency/normalisation to my homogeneity check?
Should I resize each image to the same size (100x100)? Should my threshold be a formula? Say 15 / img.rows * img.cols or 15 / MAX_HISTOGRAM_PEAK?
Edit Code:
void split_mat(const Mat& src, Mat& split1, Mat& split2, Mat& split3, Mat& split4) {
split1 = Mat(src, Rect(Point(0, 0), Point(src.cols / 2, src.rows / 2)));
split2 = Mat(src, Rect(Point(src.cols/2, 0), Point(src.cols, src.rows / 2)));
split3 = Mat(src, Rect(Point(0, src.rows/2), Point(src.cols / 2, src.rows)));
split4 = Mat(src, Rect(Point(src.cols/2, src.rows/2), Point(src.cols, src.rows)));
}
void segment_by_homogeny(const Mat& src, double threshold) {
Scalar mean, stddev;
meanStdDev(src, mean, stddev);
double dev = stddev[0]; // / (src.rows * src.cols) * 100.0;
if (dev >= threshold) {
Mat s1, s2, s3, s4;
split_mat(src, s1, s2, s3, s4);
// Go recursive and segment each sub-segment where necessary
segment_by_homogeny(s1, threshold);
segment_by_homogeny(s2, threshold);
segment_by_homogeny(s3, threshold);
segment_by_homogeny(s4, threshold);
}
else {
// Store 'segment' in global vector 'images'
// and write std dev on it
char d[255];
sprintf_s(d, "Std Dev: %f", stddev[0]);
putText(src, d, cvPoint(30, 60),
FONT_HERSHEY_COMPLEX_SMALL, 0.7, cvScalar(200, 200, 250), 1, CV_AA);
images.push_back(src);
}
}
// current usage for the example image results in infinite recursion.
// The green and red segment never has a std dev < 25
segment_by_homogeny(img, 25);
I am expecting my algorithm to produce the following 5 segments:

You can simplify your algorithm. Because you want to divide the given region into 4 subregions, you can first divide it into the 4 subregions, then calculate the average luminosity value for each, and have your threshold on the difference between these neighbor values.
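A minimal sketch of that idea, in plain Python on a nested list rather than a cv::Mat. The function names are hypothetical, and comparing the largest and smallest of the four quadrant means is a simplification of "difference between neighbor values", not necessarily what the answer had in mind:

```python
def region_mean(img, top, left, height, width):
    """Mean intensity of a rectangular region of a 2D list of numbers."""
    total = 0
    for r in range(top, top + height):
        for c in range(left, left + width):
            total += img[r][c]
    return total / (height * width)


def split_by_quadrant_means(img, top, left, height, width, threshold, segments):
    """Recursively split a region while its four quadrant means disagree."""
    # Regions too small to split further are accepted as segments.
    if height < 2 or width < 2:
        segments.append((top, left, height, width))
        return
    h2, w2 = height // 2, width // 2
    quads = [
        (top, left, h2, w2),
        (top, left + w2, h2, width - w2),
        (top + h2, left, height - h2, w2),
        (top + h2, left + w2, height - h2, width - w2),
    ]
    means = [region_mean(img, *q) for q in quads]
    if max(means) - min(means) > threshold:
        for q in quads:
            split_by_quadrant_means(img, *q, threshold, segments)
    else:
        # Quadrants agree: record (top, left, height, width) as one segment.
        segments.append((top, left, height, width))
```

Because the test compares means rather than a spread inside one region, the threshold keeps the same meaning at every recursion depth. The trade-off is that a textured region whose quadrants happen to share the same mean would not be split.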

Related

Estimate camera orientation from ground 3D points?

Given a set of 3D points in camera's perspective corresponding to a planar surface (ground), is there any fast efficient method to find the orientation of the plane regarding the camera's plane? Or is it only possible by running heavier "surface matching" algorithms on the point cloud?
I've tried to use estimateAffine3D and findHomography, but my main limitation is that I don't have the point coordinates on the surface plane - I can only select a set of points from the depth images and thus must work from a set of 3D points in the camera frame.
I've written a simple geometric approach that takes a couple of points and computes vertical and horizontal angles based on depth measurement, but I fear this is both not very robust nor very precise.
EDIT: Following the suggestion by @Micka, I've attempted to fit the points to a 2D plane on the camera's frame, with the following function:
#include <opencv2/opencv.hpp>
//------------------------------------------------------------------------------
/// @brief Fits a set of 3D points to a 2D plane, by solving a system of linear equations of type aX + bY + cZ + d = 0
///
/// @param[in] points The points
///
/// @return 3x1 Mat with plane equation coefficients [a, b, d] (c is fixed to -1)
///
cv::Mat fitPlane(const std::vector< cv::Point3d >& points) {
// plane equation: aX + bY + cZ + d = 0
// assuming c=-1 -> aX + bY + d = z
cv::Mat xys = cv::Mat::ones(points.size(), 3, CV_64FC1);
cv::Mat zs = cv::Mat::ones(points.size(), 1, CV_64FC1);
// populate left and right hand matrices
for (int idx = 0; idx < points.size(); idx++) {
xys.at< double >(idx, 0) = points[idx].x;
xys.at< double >(idx, 1) = points[idx].y;
zs.at< double >(idx, 0) = points[idx].z;
}
// coeff mat
cv::Mat coeff(3, 1, CV_64FC1);
// problem is now xys * coeff = zs
// solving using SVD should output coeff
cv::SVD svd(xys);
svd.backSubst(zs, coeff);
// alternative approach -> requires mat with 3D coordinates & additional col
// solves xyzs * coeff = 0
// cv::SVD::solveZ(xyzs, coeff); // @note: data type must be double (CV_64FC1)
// check result w/ input coordinates (plane equation should output null or very small values)
double a = coeff.at< double >(0);
double b = coeff.at< double >(1);
double d = coeff.at< double >(2);
for (auto& point : points) {
std::cout << a * point.x + b * point.y + d - point.z << std::endl;
}
return coeff;
}
For simplicity purposes, it is assumed that the camera is properly calibrated and that 3D reconstruction is correct - something I already validated previously and therefore out of the scope of this issue. I use the mouse to select points on a depth/color frame pair, reconstruct the 3D coordinates and pass them into the function above.
I've also tried other approaches beyond cv::SVD::solveZ(), such as inverting xyz with cv::invert(), and with cv::solve(), but it always ended in either ridiculously small values or runtime errors regarding matrix size and/or type.
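Once coefficients [a, b, d] for z = aX + bY + d are available, the orientation follows from the plane normal (a, b, -1); d only shifts the plane and does not affect it. A small sketch of that last step, with hypothetical function names, assuming the plane is not parallel to the optical axis (the c = -1 parametrisation cannot represent that case):

```python
import math


def plane_tilt_degrees(a, b):
    """Angle between the plane z = a*x + b*y + d and the camera's image plane.

    The plane normal is (a, b, -1); the tilt is the angle between that
    normal and the optical (Z) axis.
    """
    return math.degrees(math.acos(1.0 / math.sqrt(a * a + b * b + 1.0)))


def plane_slopes_degrees(a, b):
    """Slope angles of the plane along the camera X and Y axes."""
    return math.degrees(math.atan(a)), math.degrees(math.atan(b))
```

A plane facing the camera head-on (a = b = 0) gives a tilt of 0 degrees.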

Image registration and focus stacking

Background:
I am looking to align images for a focus stacking application using a smartphone. Links to images:
First in stack: 1, Last in stack: 2, Final stacked images: 3
I.e. images are nominally the same, BUT contain:
Systematic change in FOCUS as the focal plane shifts between images
Magnification changes slightly (smartphone feature as focus changes!)
Camera moves slightly due to random vibrations.
Images need to be aligned for the focus-stacking APP to work.
Progress to date:
I use OpenCV's findTransformECC() to get the alignment. It works well after some experimentation; the question "cv2.MOTION_EUCLIDEAN for the warp_mode in ECC image alignment method" was useful for improving the initialization of the warp matrix:
Images aligned at pixel level
60secs to process 8Mpix image (1sec for 0.5Mpix image) (on 3 year old portable PC with OpenCV release libraries)
See stacked image link above.
I briefly investigated a feature detector (SIFT). It did not align the images well, presumably due to the change in focus between images.
Code:
int scale = 1;
int scaleSmall = 4;
float scaleDiff = scaleSmall / scale;
for (i = 0; i< numImages; i++) {
file = dir + image + to_string(i) + ".jpg";
col[i] = imread(file);
resize(col[i], z[i], Size(col[i].cols/scale, col[i].rows/scale));
cvtColor(z[i], zg[i], CV_BGR2GRAY);
resize(zg[i], zgSmall[i], Size(col[i].cols / scaleSmall, col[i].rows / scaleSmall));
}
// Set a 2x3 or 3x3 warp matrix depending on the motion model.
// See https://www.learnopencv.com/image-alignment-ecc-in-opencv-c-python/
// Define the motion model
const int warp_mode = MOTION_HOMOGRAPHY;
// Initialize the matrix to identity
if (warp_mode == MOTION_HOMOGRAPHY) {
warp_init = Mat::eye(3, 3, CV_32F);
warp_matrix = Mat::eye(3, 3, CV_32F);
warp_matrix_prev = Mat::eye(3, 3, CV_32F);
scaleTX = (Mat_<float>(3, 3) << 1, 1, scaleDiff, 1, 1, scaleDiff, 1 / scaleDiff, 1 / scaleDiff, 1);
}
else {
warp_init = Mat::eye(2, 3, CV_32F);
scaleTX = Mat::eye(2, 3, CV_32F);
warp_matrix = Mat::eye(2, 3, CV_32F);
warp_matrix_prev = Mat::eye(2, 3, CV_32F);
scaleTX = (Mat_<float>(2, 3) << 1, 1, scaleDiff, 1, 1, scaleDiff);
}
// Specify the number of iterations.
int number_of_iterations = 5000;
// Specify the threshold of the increment
// in the correlation coefficient between two iterations
double termination_eps = 1e-8;
// Define termination criteria
TermCriteria criteria(TermCriteria::COUNT + TermCriteria::EPS, number_of_iterations, termination_eps);
for (i = 1; i < numImages; i++) {
// Check images right size
if (zg[0].rows < 10 || zg[1].rows < 10)
return;
// Run the ECC algorithm at start to get an initial guess. The results are stored in warp_matrix.
if (i == 1) {
findTransformECC(zgSmall[0], zgSmall[i], warp_init, warp_mode, criteria );
// See https://stackoverflow.com/questions/45997891/cv2-motion-euclidean-for-the-warp-mode-in-ecc-image-alignment-method
warp_matrix = warp_init * scaleTX;
}
// Warp Matrix from previous iteration is used as initialisation
findTransformECC(zg[0], zg[i], warp_matrix, warp_mode, criteria);
if (warp_mode != MOTION_HOMOGRAPHY) {
warpAffine(zg[i], ag[i], warp_matrix, zg[i].size(), INTER_LINEAR + WARP_INVERSE_MAP);
warpAffine(z[i], acol[i], warp_matrix, zg[i].size(), INTER_LINEAR + WARP_INVERSE_MAP);
}
else {
// Use warpPerspective for Homography
warpPerspective(z[i], acol[i], warp_matrix, z[i].size(), INTER_LINEAR + WARP_INVERSE_MAP);
warpPerspective(zg[i], ag[i], warp_matrix, zg[i].size(), INTER_LINEAR + WARP_INVERSE_MAP);
}
}
}
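The coarse-to-fine step in the code above depends on converting a warp estimated on the small images into one valid for the full-size images. A sketch of that conversion in plain Python, assuming the small images are downscaled by a factor s in both directions (s = scaleSmall / scale in the code above); the function names are hypothetical:

```python
def scale_homography(h, s):
    """Convert a 3x3 warp estimated on images downscaled by s to full resolution.

    This is S * H * S^-1 with S = diag(s, s, 1), written out element by element.
    """
    return [
        [h[0][0], h[0][1], h[0][2] * s],
        [h[1][0], h[1][1], h[1][2] * s],
        [h[2][0] / s, h[2][1] / s, h[2][2]],
    ]


def scale_affine(m, s):
    """Same conversion for a 2x3 affine or Euclidean warp: only the translation scales."""
    return [
        [m[0][0], m[0][1], m[0][2] * s],
        [m[1][0], m[1][1], m[1][2] * s],
    ]
```

This is an element-wise scaling of the warp entries. Note that in OpenCV C++ the expression `A * B` on two cv::Mat objects is a matrix product, while `A.mul(B)` is the element-wise product, so if the element-wise scaling is what is intended, `warp_init.mul(scaleTX)` would express it.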
Question:
Can the image registration speed be significantly improved (using the same hardware)?
There are at least 3 improvements that can be made:
1. 5000 iterations may be unnecessary. Try limiting it to 500. Moreover, transforming the images to the gradient domain may help. See the GetGradient function from this tutorial.
2. You can assume that perspective effects are negligible, so you can change warp_mode to MOTION_AFFINE to reduce the degrees of freedom from 8 to 6.
3. You can also try another, much faster approach that is based on phase correlation (frequency domain). In its standard form it only estimates the translation between images, but you can transform them to log-polar space to handle rotation and scale as well. This code implements the third approach.
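To illustrate the phase correlation idea from the last point, here is a toy sketch in plain Python with a naive DFT, so it is only practical for tiny images. It covers the translation-only case, not the log-polar extension, and the function names are hypothetical. In practice one would use an FFT library or OpenCV's cv::phaseCorrelate instead.

```python
import cmath


def dft1(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform of a 1D sequence."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out


def dft2(image, inverse=False):
    """2D DFT: transform every row, then every column."""
    rows = [dft1(row, inverse) for row in image]
    cols = [dft1(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def phase_correlate(a, b):
    """Return (dy, dx) such that b is a circularly shifted by (dy, dx)."""
    fa, fb = dft2(a), dft2(b)
    cross = []
    for row_a, row_b in zip(fa, fb):
        row = []
        for x, y in zip(row_a, row_b):
            # Normalised cross-power spectrum: keep only the phase difference.
            p = x.conjugate() * y
            row.append(p / abs(p) if abs(p) > 1e-12 else 0j)
        cross.append(row)
    corr = dft2(cross, inverse=True)
    height, width = len(corr), len(corr[0])
    # The correlation surface peaks at the shift between the two images.
    _, dy, dx = max((corr[r][c].real, r, c)
                    for r in range(height) for c in range(width))
    return dy, dx
```

Shifts are reported modulo the image size, so a peak past the halfway point corresponds to a negative shift.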

openCV 3.0 recoverPose wrong results

Has anyone used the OpenCV 3.0 recoverPose function with good results?
I've got:
cv::Mat r;
cv::Mat t;
cv::Mat E = cv::findEssentialMat(features1, features2);
cv::recoverPose(E, features1, features1, r, t);
float xAngle = radToDeg(atan2f(r.at<float>(2, 1), r.at<float>(2, 2)));
float yAngle = radToDeg(atan2f(-r.at<float>(2, 0), sqrtf(r.at<float>(2, 1) * r.at<float>(2, 1) + r.at<float>(2, 2) * r.at<float>(2, 2))));
float zAngle = radToDeg(atan2f(r.at<float>(1, 0), r.at<float>(0, 0)));
As input I use one 1836x1836 image and another 1836x1836 image that is the same picture rotated 90 degrees to the left. I rotated it in software, so the rotation is exactly 90 degrees.
I expect result:
xAngle: 0
yAngle: 0
zAngle: 90 (or -90 depending on Z direction)
Unfortunately my results are:
xAngle: 90
yAngle: 0.113659
zAngle: 180
Can anyone help me with it?
The Essential matrix cannot be used to describe pure rotation. If you know that your images are related only by a rotation with no translation, then you have to estimate the homography (aka projective transformation) between them.
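For the special case in the question, where the second image is an in-plane rotation of the first, the angle can also be recovered directly from matched points. The sketch below is a least-squares 2D rigid fit (not a full homography estimation, which in OpenCV would be cv::findHomography); the function name is hypothetical and the matches are assumed to be correct:

```python
import math


def rotation_angle_degrees(points_a, points_b):
    """Least-squares in-plane rotation (degrees) taking points_a to points_b."""
    n = len(points_a)
    # Centre both point sets so that any translation drops out.
    cax = sum(p[0] for p in points_a) / n
    cay = sum(p[1] for p in points_a) / n
    cbx = sum(p[0] for p in points_b) / n
    cby = sum(p[1] for p in points_b) / n
    dot = 0.0
    cross = 0.0
    for (xa, ya), (xb, yb) in zip(points_a, points_b):
        xa, ya = xa - cax, ya - cay
        xb, yb = xb - cbx, yb - cby
        dot += xa * xb + ya * yb      # proportional to cos(angle)
        cross += xa * yb - ya * xb    # proportional to sin(angle)
    return math.degrees(math.atan2(cross, dot))
```

The sign follows the coordinate system of the points: with image coordinates (y pointing down) a positive angle appears clockwise on screen.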

How do you map Kinect's depth data to its RGB color?

I'm working with a given dataset using OpenCV, without any Kinect by my side. And I would like to map the given depth data to its RGB counterpart (so that I can get the actual color and the depth)
Since I'm using OpenCV and C++, and don't own a Kinect, sadly I can't utilize MapDepthFrameToColorFrame method from the official Kinect API.
From the given cameras' intrinsics and distortion coefficients, I could map the depth to world coordinates, and back to RGB based on the algorithm provided here
Vec3f depthToW( int x, int y, float depth ){
Vec3f result;
result[0] = (float) (x - depthCX) * depth / depthFX;
result[1] = (float) (y - depthCY) * depth / depthFY;
result[2] = (float) depth;
return result;
}
Vec2i wToRGB( const Vec3f & point ) {
Mat p3d( point );
p3d = extRotation * p3d + extTranslation;
float x = p3d.at<float>(0, 0);
float y = p3d.at<float>(1, 0);
float z = p3d.at<float>(2, 0);
Vec2i result;
result[0] = (int) round( (x * rgbFX / z) + rgbCX );
result[1] = (int) round( (y * rgbFY / z) + rgbCY );
return result;
}
void map( Mat& rgb, Mat& depth ) {
/* intrinsics are focal points and centers of camera */
undistort( rgb, rgb, rgbIntrinsic, rgbDistortion );
undistort( depth, depth, depthIntrinsic, depthDistortion );
Mat color = Mat( depth.size(), CV_8UC3, Scalar(0) );
ushort * raw_image_ptr;
for( int y = 0; y < depth.rows; y++ ) {
raw_image_ptr = depth.ptr<ushort>( y );
for( int x = 0; x < depth.cols; x++ ) {
if( raw_image_ptr[x] >= 2047 || raw_image_ptr[x] <= 0 )
continue;
float depth_value = depthMeters[ raw_image_ptr[x] ];
Vec3f depth_coord = depthToW( y, x, depth_value );
Vec2i rgb_coord = wToRGB( depth_coord );
color.at<Vec3b>(y, x) = rgb.at<Vec3b>(rgb_coord[0], rgb_coord[1]);
}
}
}
But the result seems to be misaligned. I can't manually set the translations, since the dataset is obtained from 3 different Kinects, and each of them is misaligned in a different direction. You can see one of them below (Left: undistorted RGB, Middle: undistorted Depth, Right: mapped RGB to Depth)
My question is, what should I do at this point? Did I miss a step while trying to project either depth to world or world back to RGB? Can anyone who has experience with stereo cameras point out my missteps?
I assume you would need to calibrate the depth sensor with the RGB data in the same way you would calibrate a stereo camera. OpenCV has some functions (and tutorials) that you may be able to leverage.
A few other things that may be useful
http://www.ros.org/wiki/kinect_calibration/technical
https://github.com/robbeofficial/KinectCalib
http://www.mathworks.com/matlabcentral/linkexchange/links/2882-kinect-calibration-toolbox
This contains a paper on how to do it.
OpenCV has no functions for aligning depth stream to color video stream. But I know that there is special function named MapDepthFrameToColorFrame in "Kinect for Windows SDK".
I have no code for example, but hope that this is good point to start.
Upd:
Here is an example of mapping the color image to depth using the Kinect SDK with an interface to OpenCV (not my code).
It looks like you are not considering in your solution the extrinsics between both cameras.
Yes, you didn't consider the transformation between RGB and Depth.
But you can compute this matrix with the cvStereoCalibrate() method: just pass it the image sequences of both RGB and depth with the detected checkerboard corners.
You can find the details in the OpenCV documentation:
http://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#double stereoCalibrate(InputArrayOfArrays objectPoints, InputArrayOfArrays imagePoints1, InputArrayOfArrays imagePoints2, InputOutputArray cameraMatrix1, InputOutputArray distCoeffs1, InputOutputArray cameraMatrix2, InputOutputArray distCoeffs2, Size imageSize, OutputArray R, OutputArray T, OutputArray E, OutputArray F, TermCriteria criteria, int flags)
And the whole method idea behind this is:
color uv <- color normalize <- color space <- DtoC transformation <- depth space <- depth normalize <- depth uv
(uc,vc) <- <- ExtrCol * (pc) <- stereo calibrate MAT <- ExtrDep^-1 * (pd) <- <(ud - cx)*d / fx, (vd-cy)*d/fy, d> <- (ud, vd)
If you want to add distortion to RGB, you just need to follow the step in:
http://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html
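The chain described above (depth pixel, depth camera space, extrinsic transform, RGB camera space, RGB pixel) can be sketched for a single pixel as follows. This is plain Python with a hypothetical function name; the intrinsics are passed as (fx, fy, cx, cy) tuples, and rotation and translation are assumed to be the stereo-calibration result that takes depth-camera coordinates to RGB-camera coordinates. Lens distortion is ignored.

```python
def depth_pixel_to_rgb_pixel(u, v, depth, depth_k, rgb_k, rotation, translation):
    """Map depth pixel (column u, row v) with metric depth to an RGB pixel (column, row).

    depth_k and rgb_k are (fx, fy, cx, cy); rotation is a 3x3 nested list and
    translation a 3-element list.
    """
    fx, fy, cx, cy = depth_k
    # Back-project the depth pixel into the depth camera's 3D frame.
    point = [(u - cx) * depth / fx, (v - cy) * depth / fy, depth]
    # Move the point into the RGB camera's frame using the extrinsics.
    x, y, z = (sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
               for i in range(3))
    # Project into the RGB image.
    fx, fy, cx, cy = rgb_k
    return round(x * fx / z + cx), round(y * fy / z + cy)
```

The result is (column, row). When it is used to index a cv::Mat, `at<>(row, col)` expects the row first, so the two values have to be swapped. The question's code passes `(y, x)` to a function declared as `(x, y)` and indexes the RGB image with `(rgb_coord[0], rgb_coord[1])`, which is worth checking as a possible contributor to the misalignment.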

Is there a way to detect if an image is blurry? [closed]

I was wondering if there is a way to determine if an image is blurry or not by analyzing the image data.
Another very simple way to estimate the sharpness of an image is to use a Laplace (or LoG) filter and simply pick the maximum value. Using a robust measure like a 99.9% quantile is probably better if you expect noise (i.e. picking the Nth-highest contrast instead of the highest contrast.) If you expect varying image brightness, you should also include a preprocessing step to normalize image brightness/contrast (e.g. histogram equalization).
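A minimal sketch of the "maximum of the Laplacian" measure with the robust quantile variant, in plain Python on a nested list. The function names and the simple rank-based quantile are assumptions for illustration, and a plain 4-neighbour Laplacian stands in for the LoG filter:

```python
def laplacian_abs(img):
    """Absolute 4-neighbour Laplacian response for the interior pixels of a 2D list."""
    out = []
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            lap = (img[r - 1][c] + img[r + 1][c] + img[r][c - 1] + img[r][c + 1]
                   - 4 * img[r][c])
            out.append(abs(lap))
    return out


def sharpness_quantile(img, q=0.999):
    """q-quantile of the absolute Laplacian response; q=1.0 gives the plain maximum."""
    values = sorted(laplacian_abs(img))
    index = min(len(values) - 1, int(q * len(values)))
    return values[index]
```

On small images the 99.9% quantile coincides with the maximum; the difference only shows once the image has well over a thousand pixels, where it discards a handful of outliers such as hot pixels.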
I've implemented Simon's suggestion and this one in Mathematica, and tried it on a few test images:
The first test blurs the test images using a Gaussian filter with a varying kernel size, then calculates the FFT of the blurred image and takes the average of the 90% highest frequencies:
testFft[img_] := Table[
(
blurred = GaussianFilter[img, r];
fft = Fourier[ImageData[blurred]];
{w, h} = Dimensions[fft];
windowSize = Round[w/2.1];
Mean[Flatten[(Abs[
fft[[w/2 - windowSize ;; w/2 + windowSize,
h/2 - windowSize ;; h/2 + windowSize]]])]]
), {r, 0, 10, 0.5}]
Result in a logarithmic plot:
The 5 lines represent the 5 test images, the X axis represents the Gaussian filter radius. The graphs are decreasing, so the FFT is a good measure for sharpness.
This is the code for the "highest LoG" blurriness estimator: It simply applies an LoG filter and returns the brightest pixel in the filter result:
testLaplacian[img_] := Table[
(
blurred = GaussianFilter[img, r];
Max[Flatten[ImageData[LaplacianGaussianFilter[blurred, 1]]]]
), {r, 0, 10, 0.5}]
Result in a logarithmic plot:
The spread for the un-blurred images is a little better here (2.5 vs 3.3), mainly because this method only uses the strongest contrast in the image, while the FFT is essentially a mean over the whole image. The functions are also decreasing faster, so it might be easier to set a "blurry" threshold.
Yes, it is. Compute the Fast Fourier Transform and analyse the result. The Fourier transform tells you which frequencies are present in the image. If there is a low amount of high frequencies, then the image is blurry.
Defining the terms 'low' and 'high' is up to you.
Edit:
As stated in the comments, if you want a single float representing the blurryness of a given image, you have to work out a suitable metric.
nikie's answer provides such a metric. Convolve the image with a Laplacian kernel:
0  1  0
1 -4  1
0  1  0
And use a robust maximum metric on the output to get a number which you can use for thresholding. Try to avoid smoothing the images too much before computing the Laplacian, because you will only find out that a smoothed image is indeed blurry :-).
During some work with an auto-focus lens, I came across this very useful set of algorithms for detecting image focus. It's implemented in MATLAB, but most of the functions are quite easy to port to OpenCV with filter2D.
It's basically a survey implementation of many focus measurement algorithms. If you want to read the original papers, references to the authors of the algorithms are provided in the code. The 2012 paper by Pertuz, et al. Analysis of focus measure operators for shape from focus (SFF) gives a great review of all of these measure as well as their performance (both in terms of speed and accuracy as applied to SFF).
EDIT: Added MATLAB code just in case the link dies.
function FM = fmeasure(Image, Measure, ROI)
%This function measures the relative degree of focus of
%an image. It may be invoked as:
%
% FM = fmeasure(Image, Method, ROI)
%
%Where
% Image, is a grayscale image and FM is the computed
% focus value.
% Method, is the focus measure algorithm as a string.
% see 'operators.txt' for a list of focus
% measure methods.
% ROI, Image ROI as a rectangle [xo yo width height].
% if an empty argument is passed, the whole
% image is processed.
%
% Said Pertuz
% Abr/2010
if ~isempty(ROI)
Image = imcrop(Image, ROI);
end
WSize = 15; % Size of local window (only some operators)
switch upper(Measure)
case 'ACMO' % Absolute Central Moment (Shirvaikar2004)
if ~isinteger(Image), Image = im2uint8(Image);
end
FM = AcMomentum(Image);
case 'BREN' % Brenner's (Santos97)
[M N] = size(Image);
DH = Image;
DV = Image;
DH(1:M-2,:) = diff(Image,2,1);
DV(:,1:N-2) = diff(Image,2,2);
FM = max(DH, DV);
FM = FM.^2;
FM = mean2(FM);
case 'CONT' % Image contrast (Nanda2001)
ImContrast = inline('sum(abs(x(:)-x(5)))');
FM = nlfilter(Image, [3 3], ImContrast);
FM = mean2(FM);
case 'CURV' % Image Curvature (Helmli2001)
if ~isinteger(Image), Image = im2uint8(Image);
end
M1 = [-1 0 1;-1 0 1;-1 0 1];
M2 = [1 0 1;1 0 1;1 0 1];
P0 = imfilter(Image, M1, 'replicate', 'conv')/6;
P1 = imfilter(Image, M1', 'replicate', 'conv')/6;
P2 = 3*imfilter(Image, M2, 'replicate', 'conv')/10 ...
-imfilter(Image, M2', 'replicate', 'conv')/5;
P3 = -imfilter(Image, M2, 'replicate', 'conv')/5 ...
+3*imfilter(Image, M2, 'replicate', 'conv')/10;
FM = abs(P0) + abs(P1) + abs(P2) + abs(P3);
FM = mean2(FM);
case 'DCTE' % DCT energy ratio (Shen2006)
FM = nlfilter(Image, [8 8], @DctRatio);
FM = mean2(FM);
case 'DCTR' % DCT reduced energy ratio (Lee2009)
FM = nlfilter(Image, [8 8], @ReRatio);
FM = mean2(FM);
case 'GDER' % Gaussian derivative (Geusebroek2000)
N = floor(WSize/2);
sig = N/2.5;
[x,y] = meshgrid(-N:N, -N:N);
G = exp(-(x.^2+y.^2)/(2*sig^2))/(2*pi*sig);
Gx = -x.*G/(sig^2);Gx = Gx/sum(Gx(:));
Gy = -y.*G/(sig^2);Gy = Gy/sum(Gy(:));
Rx = imfilter(double(Image), Gx, 'conv', 'replicate');
Ry = imfilter(double(Image), Gy, 'conv', 'replicate');
FM = Rx.^2+Ry.^2;
FM = mean2(FM);
case 'GLVA' % Graylevel variance (Krotkov86)
FM = std2(Image);
case 'GLLV' %Graylevel local variance (Pech2000)
LVar = stdfilt(Image, ones(WSize,WSize)).^2;
FM = std2(LVar)^2;
case 'GLVN' % Normalized GLV (Santos97)
FM = std2(Image)^2/mean2(Image);
case 'GRAE' % Energy of gradient (Subbarao92a)
Ix = Image;
Iy = Image;
Iy(1:end-1,:) = diff(Image, 1, 1);
Ix(:,1:end-1) = diff(Image, 1, 2);
FM = Ix.^2 + Iy.^2;
FM = mean2(FM);
case 'GRAT' % Thresholded gradient (Santos97)
Th = 0; %Threshold
Ix = Image;
Iy = Image;
Iy(1:end-1,:) = diff(Image, 1, 1);
Ix(:,1:end-1) = diff(Image, 1, 2);
FM = max(abs(Ix), abs(Iy));
FM(FM<Th)=0;
FM = sum(FM(:))/sum(sum(FM~=0));
case 'GRAS' % Squared gradient (Eskicioglu95)
Ix = diff(Image, 1, 2);
FM = Ix.^2;
FM = mean2(FM);
case 'HELM' %Helmli's mean method (Helmli2001)
MEANF = fspecial('average',[WSize WSize]);
U = imfilter(Image, MEANF, 'replicate');
R1 = U./Image;
R1(Image==0)=1;
index = (U>Image);
FM = 1./R1;
FM(index) = R1(index);
FM = mean2(FM);
case 'HISE' % Histogram entropy (Krotkov86)
FM = entropy(Image);
case 'HISR' % Histogram range (Firestone91)
FM = max(Image(:))-min(Image(:));
case 'LAPE' % Energy of laplacian (Subbarao92a)
LAP = fspecial('laplacian');
FM = imfilter(Image, LAP, 'replicate', 'conv');
FM = mean2(FM.^2);
case 'LAPM' % Modified Laplacian (Nayar89)
M = [-1 2 -1];
Lx = imfilter(Image, M, 'replicate', 'conv');
Ly = imfilter(Image, M', 'replicate', 'conv');
FM = abs(Lx) + abs(Ly);
FM = mean2(FM);
case 'LAPV' % Variance of laplacian (Pech2000)
LAP = fspecial('laplacian');
ILAP = imfilter(Image, LAP, 'replicate', 'conv');
FM = std2(ILAP)^2;
case 'LAPD' % Diagonal laplacian (Thelen2009)
M1 = [-1 2 -1];
M2 = [0 0 -1;0 2 0;-1 0 0]/sqrt(2);
M3 = [-1 0 0;0 2 0;0 0 -1]/sqrt(2);
F1 = imfilter(Image, M1, 'replicate', 'conv');
F2 = imfilter(Image, M2, 'replicate', 'conv');
F3 = imfilter(Image, M3, 'replicate', 'conv');
F4 = imfilter(Image, M1', 'replicate', 'conv');
FM = abs(F1) + abs(F2) + abs(F3) + abs(F4);
FM = mean2(FM);
case 'SFIL' %Steerable filters (Minhas2009)
% Angles = [0 45 90 135 180 225 270 315];
N = floor(WSize/2);
sig = N/2.5;
[x,y] = meshgrid(-N:N, -N:N);
G = exp(-(x.^2+y.^2)/(2*sig^2))/(2*pi*sig);
Gx = -x.*G/(sig^2);Gx = Gx/sum(Gx(:));
Gy = -y.*G/(sig^2);Gy = Gy/sum(Gy(:));
R(:,:,1) = imfilter(double(Image), Gx, 'conv', 'replicate');
R(:,:,2) = imfilter(double(Image), Gy, 'conv', 'replicate');
R(:,:,3) = cosd(45)*R(:,:,1)+sind(45)*R(:,:,2);
R(:,:,4) = cosd(135)*R(:,:,1)+sind(135)*R(:,:,2);
R(:,:,5) = cosd(180)*R(:,:,1)+sind(180)*R(:,:,2);
R(:,:,6) = cosd(225)*R(:,:,1)+sind(225)*R(:,:,2);
R(:,:,7) = cosd(270)*R(:,:,1)+sind(270)*R(:,:,2);
R(:,:,8) = cosd(315)*R(:,:,1)+sind(315)*R(:,:,2);
FM = max(R,[],3);
FM = mean2(FM);
case 'SFRQ' % Spatial frequency (Eskicioglu95)
Ix = Image;
Iy = Image;
Ix(:,1:end-1) = diff(Image, 1, 2);
Iy(1:end-1,:) = diff(Image, 1, 1);
FM = mean2(sqrt(double(Iy.^2+Ix.^2)));
case 'TENG'% Tenengrad (Krotkov86)
Sx = fspecial('sobel');
Gx = imfilter(double(Image), Sx, 'replicate', 'conv');
Gy = imfilter(double(Image), Sx', 'replicate', 'conv');
FM = Gx.^2 + Gy.^2;
FM = mean2(FM);
case 'TENV' % Tenengrad variance (Pech2000)
Sx = fspecial('sobel');
Gx = imfilter(double(Image), Sx, 'replicate', 'conv');
Gy = imfilter(double(Image), Sx', 'replicate', 'conv');
G = Gx.^2 + Gy.^2;
FM = std2(G)^2;
case 'VOLA' % Vollath's correlation (Santos97)
Image = double(Image);
I1 = Image; I1(1:end-1,:) = Image(2:end,:);
I2 = Image; I2(1:end-2,:) = Image(3:end,:);
Image = Image.*(I1-I2);
FM = mean2(Image);
case 'WAVS' %Sum of Wavelet coeffs (Yang2003)
[C,S] = wavedec2(Image, 1, 'db6');
H = wrcoef2('h', C, S, 'db6', 1);
V = wrcoef2('v', C, S, 'db6', 1);
D = wrcoef2('d', C, S, 'db6', 1);
FM = abs(H) + abs(V) + abs(D);
FM = mean2(FM);
case 'WAVV' %Variance of Wav...(Yang2003)
[C,S] = wavedec2(Image, 1, 'db6');
H = abs(wrcoef2('h', C, S, 'db6', 1));
V = abs(wrcoef2('v', C, S, 'db6', 1));
D = abs(wrcoef2('d', C, S, 'db6', 1));
FM = std2(H)^2+std2(V)^2+std2(D)^2;
case 'WAVR'
[C,S] = wavedec2(Image, 3, 'db6');
H = abs(wrcoef2('h', C, S, 'db6', 1));
V = abs(wrcoef2('v', C, S, 'db6', 1));
D = abs(wrcoef2('d', C, S, 'db6', 1));
A1 = abs(wrcoef2('a', C, S, 'db6', 1));
A2 = abs(wrcoef2('a', C, S, 'db6', 2));
A3 = abs(wrcoef2('a', C, S, 'db6', 3));
A = A1 + A2 + A3;
WH = H.^2 + V.^2 + D.^2;
WH = mean2(WH);
WL = mean2(A);
FM = WH/WL;
otherwise
error('Unknown measure %s',upper(Measure))
end
end
%************************************************************************
function fm = AcMomentum(Image)
[M N] = size(Image);
Hist = imhist(Image)/(M*N);
Hist = abs((0:255)-255*mean2(Image))'.*Hist;
fm = sum(Hist);
end
%******************************************************************
function fm = DctRatio(M)
MT = dct2(M).^2;
fm = (sum(MT(:))-MT(1,1))/MT(1,1);
end
%************************************************************************
function fm = ReRatio(M)
M = dct2(M);
fm = (M(1,2)^2+M(1,3)^2+M(2,1)^2+M(2,2)^2+M(3,1)^2)/(M(1,1)^2);
end
%******************************************************************
A few examples of OpenCV versions:
// OpenCV port of 'LAPM' algorithm (Nayar89)
double modifiedLaplacian(const cv::Mat& src)
{
cv::Mat M = (cv::Mat_<double>(3, 1) << -1, 2, -1);
cv::Mat G = cv::getGaussianKernel(3, -1, CV_64F);
cv::Mat Lx;
cv::sepFilter2D(src, Lx, CV_64F, M, G);
cv::Mat Ly;
cv::sepFilter2D(src, Ly, CV_64F, G, M);
cv::Mat FM = cv::abs(Lx) + cv::abs(Ly);
double focusMeasure = cv::mean(FM).val[0];
return focusMeasure;
}
// OpenCV port of 'LAPV' algorithm (Pech2000)
double varianceOfLaplacian(const cv::Mat& src)
{
cv::Mat lap;
cv::Laplacian(src, lap, CV_64F);
cv::Scalar mu, sigma;
cv::meanStdDev(lap, mu, sigma);
double focusMeasure = sigma.val[0]*sigma.val[0];
return focusMeasure;
}
// OpenCV port of 'TENG' algorithm (Krotkov86)
double tenengrad(const cv::Mat& src, int ksize)
{
cv::Mat Gx, Gy;
cv::Sobel(src, Gx, CV_64F, 1, 0, ksize);
cv::Sobel(src, Gy, CV_64F, 0, 1, ksize);
cv::Mat FM = Gx.mul(Gx) + Gy.mul(Gy);
double focusMeasure = cv::mean(FM).val[0];
return focusMeasure;
}
// OpenCV port of 'GLVN' algorithm (Santos97)
double normalizedGraylevelVariance(const cv::Mat& src)
{
cv::Scalar mu, sigma;
cv::meanStdDev(src, mu, sigma);
double focusMeasure = (sigma.val[0]*sigma.val[0]) / mu.val[0];
return focusMeasure;
}
No guarantees on whether or not these measures are the best choice for your problem, but if you track down the papers associated with these measures, they may give you more insight. Hope you find the code useful! I know I did.
Building off of nikie's answer: it's straightforward to implement the Laplacian-based method with OpenCV:
short GetSharpness(char* data, unsigned int width, unsigned int height)
{
// assumes that your image is already in planar YUV or 8-bit greyscale
IplImage* in = cvCreateImage(cvSize(width,height),IPL_DEPTH_8U,1);
IplImage* out = cvCreateImage(cvSize(width,height),IPL_DEPTH_16S,1);
memcpy(in->imageData,data,width*height);
// aperture size of 1 corresponds to the correct matrix
cvLaplace(in, out, 1);
short maxLap = -32767;
short* imgData = (short*)out->imageData;
for(int i =0;i<(out->imageSize/2);i++)
{
if(imgData[i] > maxLap) maxLap = imgData[i];
}
cvReleaseImage(&in);
cvReleaseImage(&out);
return maxLap;
}
This returns a short indicating the maximum sharpness detected, which, based on my tests on real-world samples, is a pretty good indicator of whether a camera is in focus or not. Not surprisingly, normal values are scene dependent, but much less so than with the FFT method, which has too high a false positive rate to be useful in my application.
I came up with a totally different solution.
I needed to analyse video still frames to find the sharpest one in every (X) frames. This way, I would detect motion blur and/or out of focus images.
I ended up using Canny Edge detection and I got VERY VERY good results with almost every kind of video (with nikie's method, I had problems with digitalized VHS videos and heavy interlaced videos).
I optimized the performance by setting a region of interest (ROI) on the original image.
Using EmguCV :
//Convert image using Canny
using (Image<Gray, byte> imgCanny = imgOrig.Canny(225, 175))
{
//Count the number of pixel representing an edge
int nCountCanny = imgCanny.CountNonzero()[0];
//Compute a sharpness grade:
//< 1.5 = blurred, in movement
//1.5 to 6 = acceptable
//> 6 =stable, sharp
double dSharpness = (nCountCanny * 1000.0 / (imgCanny.Cols * imgCanny.Rows));
}
Thanks nikie for that great Laplace suggestion.
OpenCV docs pointed me in the same direction:
using python, cv2 (opencv 2.4.10), and numpy...
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
numpy.max(cv2.convertScaleAbs(cv2.Laplacian(gray, cv2.CV_16S)))
The result is between 0 and 255. I found that anything over 200ish is very much in focus, and by 100 it's noticeably blurry. The max never really gets much under 20, even if the image is completely blurred.
One way which I'm currently using measures the spread of edges in the image. Look for this paper:
@ARTICLE{Marziliano04perceptualblur,
author = {Pina Marziliano and Frederic Dufaux and Stefan Winkler and Touradj Ebrahimi},
title = {Perceptual blur and ringing metrics: Application to JPEG2000},
journal = {Signal Processing: Image Communication},
year = {2004},
pages = {163--172} }
It's usually behind a paywall but I've seen some free copies around. Basically, they locate vertical edges in an image, and then measure how wide those edges are. Averaging the width gives the final blur estimation result for the image. Wider edges correspond to blurry images, and vice versa.
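A heavily simplified sketch of the edge-width idea, in plain Python. It is not the method from the paper (which locates edges with an edge detector first); here every monotonic intensity run along a row that spans at least `min_step` grey levels is treated as an edge, and its length is taken as the edge width. The function names and the `min_step` parameter are hypothetical.

```python
def edge_widths(row, min_step=20):
    """Widths (in pixels) of monotonic intensity runs spanning at least min_step levels."""
    widths = []
    c = 0
    n = len(row)
    while c < n - 1:
        diff = row[c + 1] - row[c]
        if diff == 0:
            c += 1
            continue
        direction = 1 if diff > 0 else -1
        start = c
        # Walk along the row while the intensity keeps moving in the same direction.
        while c < n - 1 and (row[c + 1] - row[c]) * direction > 0:
            c += 1
        if abs(row[c] - row[start]) >= min_step:
            widths.append(c - start)
    return widths


def mean_edge_width(img, min_step=20):
    """Average horizontal edge width over all rows; larger values mean more blur."""
    widths = [w for row in img for w in edge_widths(row, min_step)]
    return sum(widths) / len(widths) if widths else 0.0
```

A hard step from 0 to 255 gives a width of 1, while the same step spread over several pixels gives a correspondingly larger width.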
This problem belongs to the field of no-reference image quality estimation. If you look it up on Google Scholar, you'll get plenty of useful references.
EDIT
Here's a plot of the blur estimates obtained for the 5 images in nikie's post. Higher values correspond to greater blur. I used a fixed-size 11x11 Gaussian filter and varied the standard deviation (using imagemagick's convert command to obtain the blurred images).
If you compare images of different sizes, don't forget to normalize by the image width, since larger images will have wider edges.
Finally, a significant problem is distinguishing between artistic blur and undesired blur (caused by focus miss, compression, relative motion of the subject to the camera), but that is beyond simple approaches like this one. For an example of artistic blur, have a look at the Lenna image: Lenna's reflection in the mirror is blurry, but her face is perfectly in focus. This contributes to a higher blur estimate for the Lenna image.
Answers above elucidated many things, but I think it is useful to make a conceptual distinction.
What if you take a perfectly on-focus picture of a blurred image?
The blurring detection problem is only well posed when you have a reference. If you need to design, e.g., an auto-focus system, you compare a sequence of images taken with different degrees of blurring, or smoothing, and you try to find the point of minimum blurring within this set. In other words, you need to cross-reference the various images using one of the techniques illustrated above (basically, with various possible levels of refinement in the approach, looking for the one image with the highest high-frequency content).
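The relative comparison described here can be sketched as "score every frame of the same scene and keep the best". The example is plain Python with hypothetical function names, and gradient energy is just one possible focus measure; any of the measures discussed in the other answers could be passed in instead.

```python
def gradient_energy(img):
    """Mean squared forward difference (horizontal plus vertical) of a 2D list."""
    rows, cols = len(img) - 1, len(img[0]) - 1
    total = 0
    for r in range(rows):
        for c in range(cols):
            dx = img[r][c + 1] - img[r][c]
            dy = img[r + 1][c] - img[r][c]
            total += dx * dx + dy * dy
    return total / (rows * cols)


def sharpest_index(frames, measure=gradient_energy):
    """Index of the frame with the highest focus score, plus all scores."""
    scores = [measure(frame) for frame in frames]
    best = max(range(len(frames)), key=scores.__getitem__)
    return best, scores
```

The scores are only meaningful relative to each other, which is the point of the paragraph above: the frames must show the same scene for the comparison to say anything about focus.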
I tried the solution based on the Laplacian filter from this post. It didn't help me. So I tried the solution from this post, and it worked well for my case (but it is slow):
import cv2
image = cv2.imread("test.jpeg")
height, width = image.shape[:2]
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
def px(x, y):
    return int(gray[y, x])
total = 0
for x in range(width-1):
    for y in range(height):
        total += abs(px(x, y) - px(x+1, y))
The least blurred image has the largest total!
You can also tune speed and accuracy by changing step, e.g.
this part
for x in range(width - 1):
you can replace with this one
for x in range(0, width - 1, 10):
MATLAB code for two methods that have been published in a highly regarded journal (IEEE Transactions on Image Processing) is available here: https://ivulab.asu.edu/software
Check the CPBDM and JNBM algorithms. The code is not very hard to port, and incidentally it uses Marziliano's method as its basic feature.
I implemented this using the FFT in MATLAB: I look at the histogram of the FFT magnitude and compute its mean and standard deviation; a function could also be fitted to the histogram.
fa = abs(fftshift(fft2(sharp_img)));
fb = abs(fftshift(fft2(blured_img)));
f1=20*log10(0.001+fa);
f2=20*log10(0.001+fb);
figure,imagesc(f1);title('org')
figure,imagesc(f2);title('blur')
figure,hist(f1(:),100);title('org')
figure,hist(f2(:),100);title('blur')
mf1=mean(f1(:));
mf2=mean(f2(:));
mfd1=median(f1(:));
mfd2=median(f2(:));
sf1=std(f1(:));
sf2=std(f2(:));
That's what I do in Opencv to detect focus quality in a region:
Mat grad;
int scale = 1;
int delta = 0;
int ddepth = CV_8U;
Mat grad_x, grad_y;
Mat abs_grad_x, abs_grad_y;
/// Gradient X
Sobel(matFromSensor, grad_x, ddepth, 1, 0, 3, scale, delta, BORDER_DEFAULT);
/// Gradient Y
Sobel(matFromSensor, grad_y, ddepth, 0, 1, 3, scale, delta, BORDER_DEFAULT);
convertScaleAbs(grad_x, abs_grad_x);
convertScaleAbs(grad_y, abs_grad_y);
addWeighted(abs_grad_x, 0.5, abs_grad_y, 0.5, 0, grad);
cv::Scalar mu, sigma;
cv::meanStdDev(grad, /* mean */ mu, /*stdev*/ sigma);
focusMeasure = mu.val[0] * mu.val[0];

Resources