I try to do this face recognition example, with Eigenfaces. And cannot understand what these functions subspaceProject() and subspaceReconstruct() actually do. I tried to search on http://docs.opencv.org/ but there was not any link to description.
Could you explain me what this part of the code actually do?
Mat evs = Mat(W, Range::all(), Range(0, num_components));
Mat projection = subspaceProject(evs, mean, images[0].reshape(1,1));
Mat reconstruction = subspaceReconstruct(evs, mean, projection);
// Normalize the result:
reconstruction = norm_0_255(reconstruction.reshape(1, images[0].rows));
I mean what these functions do, what kind of cv::Mat they return ?
subspaceProject() will basically give you a dimensionality reduction.
projection = (images[0] - mean) * evs
Subtracting the mean ensures that the images approximate a subspace.
Presumably evs is the truncated right singular vectors.
and for subspaceReconstruct()
reconstruction = projection * transpose(evs) + mean
The reconstruction is just the reverse of the projection, except since evs is truncated, it can not be perfect.
See PCA
Related
I want to know about why do we normalize the homography or fundamental matrix? Here is the code in particular.
H = H * (1.0 / H[2, 2]) # Normalization step. H is [3, 3] matrix.
I can understand that we have to normalize the data before computing SVD because of instability caused by linear least squares but why do we normalize it in end?
A homography in 3D space has 8 degrees of freedom by definition, mapping from one plane to another using perspective. Such a homography can be defined by giving four points, which makes eight coordinates (scalars).
A 3x3 matrix has 9 elements, so it has 9 degrees of freedom. That is one degree more than needed for a homography.
The homography doesn't change when the matrix is scaled (multiplied by a scalar). All the math works the same. You don't need to normalize your homography matrix.
It is a good idea to normalize.
For one, it makes the arithmetic somewhat tamer. Have some wikipedia links to fields of study because weaving all these into a coherent sentence... doesn't add anything:
Numerical analysis, Condition number, Floating-point arithmetic, Numerical error, Numerical stability, ...
Also, normalization makes the matrix easier for humans to interpret. The most common normalization is to scale the matrix such that the last element becomes 1. That is convenient because this whole math happens in a projective space, where the projection causes points to be mapped to the w=1 plane, making vectors have a 1 for the last element.
How is the homography matrix provided to you?
For example, in the scene that some library function calculates and provides the homography matrix to you,
if the function specification doesn't mention about the scale...
In an extreme case, the function can be implemented as:
Matrix3x3 CalculateHomographyMatrix( some arguments )
{
Matrix3x3 H = ...; //Homogoraphy Calculation
return Non_Zero_Random_Value * H; //Wow!
}
Element values may become very large or very small and using such values to your process may cause problems (floating point computation errors).
I'm solving VSLAM task using 2d-3d algo based on OpenCV library. Now I'm trying to make georeferencing using GPS data. I transform R, t of each camera and then triangulate matched points using trivial function
Triangulate(const cv::KeyPoint &kp1, const cv::KeyPoint &kp2, const cv::Mat &P1, const cv::Mat &P2, cv::Mat &x3D) {
cv::Mat A(4,4,CV_32F);
A.row(0) = kp1.pt.x*P1.row(2)-P1.row(0);
A.row(1) = kp1.pt.y*P1.row(2)-P1.row(1);
A.row(2) = kp2.pt.x*P2.row(2)-P2.row(0);
A.row(3) = kp2.pt.y*P2.row(2)-P2.row(1);
cv::Mat u,w,vt;
cv::SVD::compute(A,w,u,vt,cv::SVD::MODIFY_A| cv::SVD::FULL_UV);
x3D = vt.row(3).t();
x3D = x3D.rowRange(0,3)/x3D.at<float>(3); }
, where kp1 and kp2 - keypoint on left-right image, P1, P2 - projection matrices
I have faced strange problem: when I'm making simple shift for cameras centers by some huge constant, I've got big reprojection errors on old suitable triangulated points. Is SVD decomposition for points triangulation sensitive to cameras centers scale?
Just mistake in another part of code. Sorry.
I found on the internet that laplacian method is quite good technique to compute the sharpness of a image. I was trying to implement it in opencv 2.4.10. How can I get the sharpness measure after applying the Laplacian function? Below is the code:
Mat src_gray, dst;
int kernel_size = 3;
int scale = 1;
int delta = 0;
int ddepth = CV_16S;
GaussianBlur( src, src, Size(3,3), 0, 0, BORDER_DEFAULT );
/// Convert the image to grayscale
cvtColor( src, src_gray, CV_RGB2GRAY );
/// Apply Laplace function
Mat abs_dst;
Laplacian( src_gray, dst, ddepth, kernel_size, scale, delta, BORDER_DEFAULT );
//compute sharpness
??
Can someone please guide me on this?
Possible duplicate of: Is there a way to detect if an image is blurry?
so your focus measure is:
cv::Laplacian(src_gray, dst, CV_64F);
cv::Scalar mu, sigma;
cv::meanStdDev(dst, mu, sigma);
double focusMeasure = sigma.val[0] * sigma.val[0];
Edit #1:
Okay, so a well focused image is expected to have sharper edges, so the use of image gradients are instrumental in order to determine a reliable focus measure. Given an image gradient, the focus measure pools the data at each point as an unique value.
The use of second derivatives is one technique for passing the high spatial frequencies, which are associated with sharp edges. As a second derivative operator we use the Laplacian operator, that is approximated using the mask:
To pool the data at each point, we use two methods. The first one is the sum of all the absolute values, driving to the following focus measure:
where L(m, n) is the convolution of the input image I(m, n) with the mask L. The second method calculates the variance of the absolute values, providing a new focus measure given by:
where L overline is the mean of absolute values.
Read the article
J.L. Pech-Pacheco, G. Cristobal, J. Chamorro-Martinez, J.
Fernandez-Valdivia, "Diatom autofocusing in brightfield microscopy: a
comparative study", 15th International Conference on Pattern
Recognition, 2000. (Volume:3 )
for more information.
Not exactly the answer, but I got a formula using an intuitive approach that worked on the wild.
I'm currently working in a script to detect multiple faces in a picture with a crowd, using mtcnn , which it worked very well, however it also detected many faces so blurry that you couldn't say it was properly a face.
Example image:
Faces detected:
Matrix of detected faces:
mtcnn detected about 123 faces, however many of them had little resemblance as a face. In fact, many faces look more like a stain than anything else...
So I was looking a way of 'filtering' those blurry faces. I tried the Laplacian filter and FFT way of filtering I found on this answer , however I had inconsistent results and poor filtering results.
I turned my research in computer vision topics, and finally tried to implement an 'intuitive' way of filtering using the following principle:
When more blurry is an image, less 'edges' we have
If we compare a crisp image with a blurred version of the same image, the results tends to 'soften' any edges or adjacent contrasting regions. Based on that principle, I was finding a way of weighting edges and then a simple way of 'measuring' the results to get a confidence value.
I took advantage of Canny detection in OpenCV and then apply a mean value of the result (Python):
def getBlurValue(image):
canny = cv2.Canny(image, 50,250)
return np.mean(canny)
Canny return 2x2 array same image size . I selected threshold 50,250 but it can be changed depending of your image and scenario.
Then I got the average value of the canny result, (definitively a formula to be improved if you know what you're doing).
When an image is blurred the result will get a value tending to zero, while crisp image tend to be a positive value, higher when crisper is the image.
This value depend on the images and threshold, so it is not a universal solution for every scenario, however a best value can be achieved normalizing the result and averaging all the faces (I need more work on that subject).
In the example, the values are in the range 0-27.
I averaged all faces and I got about a 3.7 value of blur
If I filter images above 3.7:
So I kept with mosth crisp faces:
That consistently gave me better results than the other tests.
Ok, you got me. This is a tricky way of detecting a blurriness values inside the same image space. But I hope people can take advantage of this findings and apply what I learned in its own projects.
When using OpenCV's findHomography function to estimate an homography between two sets of points, from different images, you will sometimes get a bad homography due to outliers within your input points, even if you use RANSAC or LMEDS.
// opencv java example:
Mat H = Calib3d.findHomography( src_points, dst_points, Calib3d.RANSAC, 10 );
How can you tell if the resulting 3x3 homography matrix is acceptable or not?
I have looked for an answer to this here in Stackoverflow and in Google and was unable to find it.
I found this article, but it is a bit cryptic to me:
"The geometric error for homographies"
The best way to tell if the homography is acceptable is.
1- Take the points of one image and reproject them using the computed homography.
//for one 3D point, this would be the projection
px' = H * px;
py' = H * py;
pz' = H * pz;
2- Calculate the euclidean distance between the reprojected points and the real points in the image.
Reprojection error for one point. p is the projected point and q is the real point.
3- Establish a treshold that decides if the reprojection error is acceptable.
For example, an error greater than one pixel wouldn't be acceptable for many tracking applications.
How do I retrieve the rotation matrix, the translation vector and maybe some scaling factors of each camera using OpenCV when I have pictures of an object from the view of each of these cameras? For every picture I have the image coordinates of several feature points. Not all feature points are visible in all of the pictures.
I want to map the computed 3D coordinates of the feature points of the object to a slightly different object to align the shape of the second object to the first object.
I heard it is possible using cv::calibrateCamera(...) but I can't get quite through it...
Does someone have experiences with that kind of problem?
I was confronted with the same problem as you, in OpenCV. I had a stereo image pair and I wanted to computed the external parameters of the cameras and the world coordinates of all observed points. This problem has been treated here:
Berthold K. P. Horn. Relative orientation revisited. Berthold K. P. Horn. Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 545 Technology ...
http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.64.4700
However, I wasn't able to find a suitable implementation of this problem (perhaps you will find one). Due to time limitations I did not have time to understand all the maths in this paper and implement it myself, so I came up with a quick-and-dirty solution that works for me. I will explain what I did to solve it:
Assuming we have two cameras, where the first camera has external parameters RT = Matx::eye(). Now make a guess about the the rotation R of the second camera. For every pair of image points observed in both images, we compute the directions of their corresponding rays in world coordinates and store them in a 2d-array dirs (EDIT: The internal camera parameters are assumed to be known). We can do this since we assume that we know the orientation of every camera. Now we build an overdetermined linear system AC = 0 where C is the centre of the second camera. I provide you with the function to compute A:
Mat buildA(Matx<double, 3, 3> &R, Array<Vec3d, 2> dirs)
{
CV_Assert(dirs.size(0) == 2);
int pointCount = dirs.size(1);
Mat A(pointCount, 3, DataType<double>::type);
Vec3d *a = (Vec3d *)A.data;
for (int i = 0; i < pointCount; i++)
{
a[i] = dirs(0, i).cross(toVec(R*dirs(1, i)));
double length = norm(a[i]);
if (length == 0.0)
{
CV_Assert(false);
}
else
{
a[i] *= (1.0/length);
}
}
return A;
}
Then calling cv::SVD::solveZ(A) will give you the least-squares solution of norm 1 to this system. This way, you obtain the rotation and translation of the second camera. However, since I just made a guess about the rotation of the second camera, I make several guesses about its rotation (parameterized using a 3x1 vector omega from which i compute the rotation matrix using cv::Rodrigues) and then I refine this guess by solving the system AC = 0 repetedly in a Levenberg-Marquardt optimizer with numeric jacobian. It works for me but it is a bit dirty, so you if you have time, I encourage you to implement what is explained in the paper.
EDIT:
Here is the routine in the Levenberg-Marquardt optimizer for evaluating the vector of residues:
void Stereo::eval(Mat &X, Mat &residues, Mat &weights)
{
Matx<double, 3, 3> R2Ref = getRot(X); // Map the 3x1 euler angle to a rotation matrix
Mat A = buildA(R2Ref, _dirs); // Compute the A matrix that measures the distance between ray pairs
Vec3d c;
Mat cMat(c, false);
SVD::solveZ(A, cMat); // Find the optimum camera centre of the second camera at distance 1 from the first camera
residues = A*cMat; // Compute the output vector whose length we are minimizing
weights.setTo(1.0);
}
By the way, I searched a little more on the internet and found some other code that could be useful for computing the relative orientation between cameras. I haven't tried any code yet, but it seems useful:
http://www9.in.tum.de/praktika/ppbv.WS02/doc/html/reference/cpp/toc_tools_stereo.html
http://lear.inrialpes.fr/people/triggs/src/
http://www.maths.lth.se/vision/downloads/
Are these static cameras which you wish to calibrate for future use as a stereo pair? In this case you would want to use the cv::stereoCalibrate() function. OpenCV contains some sample code, one of which is stereo_calib.cpp which may be worth investigating.