GIVEN:
An (N, 3) matrix built from a point cloud, i.e. a std::vector<cv::Point3d>:
std::vector<cv::Point3d> pcVector = ... // from somewhere
cv::Mat pcMat = cv::Mat(pcVector).reshape(1);
pcMat then looks something like:
[ 0.1, 1.3, 4.5 ]
[ 3.1, 1.4, 7.6 ]
...
[ 1.1, 3.4, 4.1 ]
GOAL:
An efficient method to append a column of ones to the matrix, i.e. to obtain the following from the matrix above.
[ 0.1, 1.3, 4.5, 1.0 ]
[ 3.1, 1.4, 7.6, 1.0 ]
...
[ 1.1, 3.4, 4.1, 1.0 ]
CURRENTLY:
cv::Mat result(N, 4, CV_64F);
cv::Mat ones = cv::Mat_<double>::ones(N, 1);
cv::hconcat(pcMat, ones, result);
QUESTION:
This seems to be inefficient since a temporary matrix full of ones needs to be created. Are there any tricks to get this done faster?
Use cv::convertPointsToHomogeneous
The function converts points from Euclidean to homogeneous space by appending 1's to the tuple of point coordinates. That is, each point (x1, x2, ..., xn) is converted to (x1, x2, ..., xn, 1).
// given
std::vector<cv::Point3d> pcVector = ... // from somewhere
cv::Mat pcMat = cv::Mat(pcVector).reshape(1);
// I did not run this
cv::Mat dst;
cv::convertPointsToHomogeneous(pcMat, dst); // pcVector oughta work too
As always, these functions accept both Nx1 K-channel Mats and NxK 1-channel Mats. The result, however, seems to always be Nx1 with K+1 channels.
And... you can imagine what convertPointsFromHomogeneous would do.
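Since the goal was an Nx4 single-channel matrix, note that the Nx1 4-channel result from above can be viewed as Nx4 single-channel without copying, via reshape — a minimal sketch under the same "I did not run this" caveat (homMat is just an illustrative name):
cv::Mat homMat = dst.reshape(1); // same data viewed as Nx4, 1 channel (no copy)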
Related
I am having trouble converting the output of solvePnP to a camera position in Unity. I have spent the last several days going through documentation, reading every question I could find about it, and trying all different approaches, but I am still stuck.
Here's what I can do: I have a 3D object in the real world with known 2D-3D correspondence points, and I can use these with solvePnP to get the rvec and tvec. Then I can plot the 2D points, and on top of those I can plot the points found with projectPoints. These points line up pretty closely.
(success, rotation_vector, translation_vector) = cv2.solvePnP(model_points, image_points, camera_matrix,
                                                              dist_coeffs, flags=cv2.SOLVEPNP_ITERATIVE)
print("Rotation Vector:\n" + str(rotation_vector))
print("Translation Vector:\n" + str(translation_vector))
(repro_points, jacobian) = cv2.projectPoints(model_points, rotation_vector,
                                             translation_vector, camera_matrix, dist_coeffs)
original_img = cv2.imread(r"{}".format("og.jpg"))
for p in repro_points:
    cv2.circle(original_img, (int(p[0][0]), int(p[0][1])), 3, (255, 255, 255), -1)
    print(str(p[0][0]) + "-" + str(p[0][1]))
for p in image_points:
    cv2.circle(original_img, (int(p[0]), int(p[1])), 3, (0, 0, 255), -1)
cv2.imshow("og", original_img)
cv2.waitKey()
The code above displays my original image with the image_points and repro_points more or less lined up. Now I understand from reading that rvec and tvec cannot be used in Unity directly; instead, they define the transformation between object space and camera space, so that a point can be mapped from one space to the other with the standard formula p_cam = R * p_obj + tvec, where R is the 3x3 rotation matrix obtained from rvec via cv2.Rodrigues.
Now I want to take what solvePnP has given me, the rvec and tvec, and determine how to rotate and translate my Unity camera so it lines up with the position from which the original picture was taken. In Unity I start with the camera facing my object, both of them at (0, 0, 0). To keep things clear in my head, I made another Python script to try to convert rvec and tvec into a Unity position and rotation, based on advice I saw on the OpenCV forum. First I get the rotation matrix from Rodrigues, transpose it, and then swap the 2nd and 3rd rows to swap y and z, although I don't know if that is right. Then I rotate the matrix, and finally I negate it and multiply it by tvec to get the position. But this position does not line up with the real world.
import sys
import numpy as np
import cv2

def main(argv):
    print("Start")
    rvec = np.array([0.11160132, -2.67532422, -0.55994949])
    rvec = rvec.reshape(3, 1)
    print("RVEC")
    print(rvec)
    tvec = np.array([0.0896325, -0.14819345, -0.36882839])
    tvec = tvec.reshape(3, 1)
    print("TVEC")
    print(tvec)
    rotmat, _ = cv2.Rodrigues(rvec)
    print("Rotation Matrix:")
    print(rotmat)
    #trans_mat = cv2.hconcat((rotmat, tvec))
    #print("Transformation Matrix:")
    #print(trans_mat)
    # transpose the rotation matrix
    transposed_mat = np.transpose(rotmat)
    print("Transposed Mat: ")
    print(transposed_mat)
    # swap rows 1 & 2
    swap = np.empty([3, 3])
    swap[0] = rotmat[0]
    swap[1] = rotmat[2]
    swap[2] = rotmat[1]
    print("SWAP")
    print(swap)
    R = np.rot90(swap)
    # this is supposed to be the rotation matrix for the camera
    print("R:")
    print(R)
    # translation matrix
    # they say the negative matrix is 1's on the diagonal -- do they mean the identity matrix?
    #negativematrix = np.identity(3)
    position = np.matmul(-R, tvec)
    print("Position: ")
    print(position)

if __name__ == "__main__":
    main(sys.argv)
The output of this code is:
Start
RVEC
[[ 0.11160132]
[-2.67532422]
[-0.55994949]]
TVEC
[[ 0.0896325 ]
[-0.14819345]
[-0.36882839]]
Rotation Matrix:
[[-0.91550667 0.00429232 -0.4022799 ]
[-0.15739624 0.91641547 0.36797976]
[ 0.37023502 0.40020526 -0.83830888]]
Transposed Mat:
[[-0.91550667 -0.15739624 0.37023502]
[ 0.00429232 0.91641547 0.40020526]
[-0.4022799 0.36797976 -0.83830888]]
SWAP
[[-0.91550667 0.00429232 -0.4022799 ]
[ 0.37023502 0.40020526 -0.83830888]
[-0.15739624 0.91641547 0.36797976]]
R:
[[-0.4022799 -0.83830888 0.36797976]
[ 0.00429232 0.40020526 0.91641547]
[-0.91550667 0.37023502 -0.15739624]]
Position:
[[0.04754685]
[0.39692311]
[0.07887335]]
If I swap y and z here I could sort of see it being close but it is still not right.
To get the rotation I have been doing the following. I also tried subtracting 180 from the y axis, since in Unity my camera and object face one another, but that is not coming out right for me either.
rotation_mat, jacobian = cv2.Rodrigues(rotation_vector)
pose_mat = cv2.hconcat((rotation_mat, translation_vector))
tr = -np.matrix(rotation_mat).T * np.matrix(translation_vector)
print("TR_TR")
print(tr)
_, _, _, _, _, _, euler_angles = cv2.decomposeProjectionMatrix(pose_mat)
print("Euler:")
print(euler_angles)
I was feeling good this morning when I got the re-projected points to line up, but now I feel like I'm stuck in the mud. Any help is appreciated. Thank you.
I had a similar problem when I was writing an AR application for Unity. I remember spending several days too, until I figured it out. I had a DLL written in C++ OpenCV which took an image from a webcam, detected an object in the image and found its pose. A front end written in C# Unity would call the DLL's functions and update the position and orientation of a 3D model accordingly.
Simplified versions of the C++ and Unity code are:
void getCurrentPose(float* outR, float* outT)
{
    cv::Mat Rvec;
    cv::Mat Tvec;
    cv::solvePnP(g_modelPoints, g_imagePoints, g_cameraMatrix, g_distortionParams, Rvec, Tvec, false, cv::SOLVEPNP_ITERATIVE);

    cv::Matx33d R;
    cv::Rodrigues(Rvec, R);

    cv::Point3d T;
    T.x = Tvec.at<double>(0, 0);
    T.y = Tvec.at<double>(1, 0);
    T.z = Tvec.at<double>(2, 0);

    // Uncomment to return the camera transformation instead of model transformation
    /* const cv::Matx33d Rinv = R.t();
    const cv::Point3d Tinv = -R.t() * T;
    R = Rinv;
    T = Tinv;
    */

    for(int i = 0; i < 9; i++)
    {
        outR[i] = (float)R.val[i];
    }
    outT[0] = (float)T.x;
    outT[1] = (float)T.y;
    outT[2] = (float)T.z;
}
and
public class Vision : MonoBehaviour {
    GameObject model = null;

    void Start() {
        model = GameObject.Find("Model");
    }

    void Update() {
        float[] r = new float[9];
        float[] t = new float[3];
        dll_getCurrentPose(r, t); // Get object pose from DLL

        Matrix4x4 R = new Matrix4x4();
        R.SetRow(0, new Vector4(r[0], r[1], r[2], 0));
        R.SetRow(1, new Vector4(r[3], r[4], r[5], 0));
        R.SetRow(2, new Vector4(r[6], r[7], r[8], 0));
        R.SetRow(3, new Vector4(0, 0, 0, 1));

        Quaternion Q = R.rotation;
        model.transform.SetPositionAndRotation(
            new Vector3(t[0], -t[1], t[2]),
            new Quaternion(-Q.x, Q.y, -Q.z, Q.w));
    }
}
It should be easy to port to Python and try it. Since you want the camera transformation, you should probably uncomment the commented lines in the C++ code. I don't know if this code will work for your case, but maybe it is worth trying. Unfortunately I don't have Unity installed anymore, so I can't try things out or make any other suggestions.
I am trying to create a 2D perspective transform matrix from individual components like translation, rotation, scale, and shear. But in the end, the matrix does not produce a true perspective effect like the one in the image below. I think I am missing some component in the code that I wrote to create the matrix. Could someone help me add the missing components and their formulation in the function below? I have used the OpenCV library for my code.
cv::Mat getPerspMatrix2D( double rz, double s, double tx, double ty, double shx, double shy)
{
    cv::Mat R = (cv::Mat_<double>(3,3) <<
        cos(rz), -sin(rz), 0,
        sin(rz),  cos(rz), 0,
              0,        0, 1);

    cv::Mat S = (cv::Mat_<double>(3,3) <<
        s, 0, 0,
        0, s, 0,
        0, 0, 1);

    cv::Mat Sh = (cv::Mat_<double>(3,3) <<
          1, shx, 0,
        shy,   1, 0,
          0,   0, 1);

    cv::Mat T = (cv::Mat_<double>(3,3) <<
        1, 0, tx,
        0, 1, ty,
        0, 0, 1);

    return T * Sh * S * R;
}
Keywords are Homography and 8DOF. As shown in 1 and 2, there exist two additional coefficients for the perspective transformation, but a second step is needed to normalize the result. I'm not familiar with OpenCV, but I hope to answer your question, a bit late, in a basic way ;-)
Step 1
You can imagine lx as describing a vanishing point on the x axis. The image shows a31 = lx = 1. The smaller lx is, the weaker the transformation; for lx = 0 the vanishing point lies infinitely far away, which means no perspective transform at all (the identity matrix).
     [ 1   0   0 ]
PL = [ 0   1   0 ]
     [ lx  ly  1 ]
lx and ly are the perspective foreshortening parameters.
Step 2
When you apply the matrix to a point, P x [u; v; 1], you will notice that the last value of the result is sometimes different from 1. For an affine transformation it is always 1; for a perspective transformation it is not. In the second step, the result is therefore divided by that last coefficient to make it 1 again. This normalization is part of the perspective effect.
Your Example Image
Image' = P4 x P3 x P2 x P1 x Image
I would translate the center of the blue rectangle to the origin, tx = -w/2 and ty = -h/2 (= P1).
Apply the projective transformation with ly = h, to slant both sides (= P2).
Then translate back so that all points lie in one quadrant (= P3).
Finally, scale to the desired size (= P4).
The normalization from Step 2 can be done right after applying P2 or once at the very end. A code sketch follows below.
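For illustration only, here is a sketch of how the question's getPerspMatrix2D could be extended with the projective matrix PL from Step 1. The parameter names lx and ly and the composition order are assumptions following the steps above, not a definitive recipe:

#include <opencv2/core.hpp>
#include <cmath>

cv::Mat getPerspMatrix2D(double rz, double s, double tx, double ty,
                         double shx, double shy, double lx, double ly)
{
    cv::Mat R = (cv::Mat_<double>(3,3) <<
        std::cos(rz), -std::sin(rz), 0,
        std::sin(rz),  std::cos(rz), 0,
                   0,             0, 1);
    cv::Mat S  = (cv::Mat_<double>(3,3) << s, 0, 0,   0, s, 0,   0, 0, 1);
    cv::Mat Sh = (cv::Mat_<double>(3,3) << 1, shx, 0,   shy, 1, 0,   0, 0, 1);
    cv::Mat T  = (cv::Mat_<double>(3,3) << 1, 0, tx,   0, 1, ty,   0, 0, 1);
    // Perspective foreshortening from Step 1
    cv::Mat PL = (cv::Mat_<double>(3,3) << 1, 0, 0,   0, 1, 0,   lx, ly, 1);
    // Rotate first, apply the projective part, then shear/scale/translate
    return T * Sh * S * PL * R;
}

When applying the result to points, remember the division from Step 2; cv::perspectiveTransform performs that normalization for you.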
I am trying to use cvCalibrateCamera2, but I get an error that the rotation matrix is not properly defined:
...calibration.cpp:1495: error: (-5) the output array of rotation vectors must be 3-channel 1xn or nx1 array or 1-channel nx3 or nx9 array, where n is the number of views
I have already tried all the possibilities mentioned in that message, but I still get this error.
My code:
CvMat *object_points = cvCreateMat((int)pp.object_points.size(), 1, CV_32FC3);
CvMat *image_points = cvCreateMat((int)pp.image_points.size(), 1, CV_32FC2);
const CvMat point_counts = cvMat((int)pp.point_counts.size(), 1, CV_32SC1, &pp.point_counts[0]);

for (size_t i = 0; i < pp.object_points.size(); i++)
{
    object_points->data.fl[i*3+0] = (float)pp.object_points[i].x;
    object_points->data.fl[i*3+1] = (float)pp.object_points[i].y;
    object_points->data.fl[i*3+2] = (float)pp.object_points[i].z;
    image_points->data.fl[i*2+0] = (float)pp.image_points[i].x;
    image_points->data.fl[i*2+1] = (float)pp.image_points[i].y;
}

CvMat* tempR = cvCreateMat(1, 3, CV_32F);
cvCalibrateCamera2(object_points, image_points, &point_counts,
                   cvSize(pp.width, pp.height), camera->m_calib_K,
                   camera->m_calib_D, tempR, &tempData->m_calib_T,
                   CV_CALIB_USE_INTRINSIC_GUESS);

// camera->m_calib_T is defined as:
// double m_calib_T_data[3];
// cvMat(3, 1, CV_64F, camera->m_calib_T_data);
I thought the rotation array used by cvCalibrateCamera2 should be 1x3 (I then want to use the Rodrigues function to get a 3x3 matrix), but it doesn't work, and neither does any other combination mentioned in the error.
Any ideas?
I am using OpenCV 2.4.0 (maybe there is a bug in that method, but for certain reasons I can't use a later version of OpenCV).
I think the error message is clear. I am not confident with C++, but I know it requires careful initialization of the output arrays.
The problem with the line
CvMat* tempR = cvCreateMat(1, 3, CV_32F);
is that tempR needs one 1x3 row for each of the N views you use. With that in mind, the message becomes clear:
...calibration.cpp:1495: error: (-5) the output array of rotation
vectors must be 3-channel 1xn or nx1 array or 1-channel nx3 or nx9
array, where n is the number of views
You must create tempR like this, where N is the number of views:
CvMat* tempR = cvCreateMat(N, 3, CV_32F);
In your code, N should be pp.point_counts.size(), since point_counts holds one entry per view.
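A minimal sketch of the fix under that assumption (note that the translation output likely needs one row per view as well, so the 3x1 m_calib_T from the question would trip the same check):

int N = (int)pp.point_counts.size();       // number of views
CvMat* tempR = cvCreateMat(N, 3, CV_32F);  // one 1x3 rotation vector per view
CvMat* tempT = cvCreateMat(N, 3, CV_32F);  // one 1x3 translation vector per view

cvCalibrateCamera2(object_points, image_points, &point_counts,
                   cvSize(pp.width, pp.height), camera->m_calib_K,
                   camera->m_calib_D, tempR, tempT,
                   CV_CALIB_USE_INTRINSIC_GUESS);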
I am learning HLSL shading, and in my vertex shader, I have code like this:
VS_OUTPUT vs_main(
    float4 inPos: POSITION,
    float2 Txr1: TEXCOORD0 )
{
    VS_OUTPUT Output;

    Output.Position = mul( inPos, matViewProjection );
    Output.Tex1 = Txr1;

    return( Output );
}
It works fine. But when I was typing the code from the book, it looked like this:
VS_OUTPUT vs_main(
    float4 inPos: POSITION,
    float2 Txr1: TEXCOORD0 )
{
    VS_OUTPUT Output;

    Output.Position = mul( matViewProjection, inPos );
    Output.Tex1 = Txr1;

    return( Output );
}
At first I thought maybe the order does not matter. However, when I exchanged the parameters of the mul function in my own code, it did not work, and I don't know why.
BTW, I am using RenderMonkey.
This issue is known as pre- vs. post-multiplication.
By convention, matrices produced by D3DX are stored in row-major order. To produce proper results you have to pre-multiply: for matViewProjection to transform the vector inPos into clip space, inPos should appear on the left-hand side (first parameter).
Order absolutely matters; matrix multiplication is not commutative. However, pre-multiplying by a matrix is the same as post-multiplying by the transpose of that matrix. Put another way: if you were using the same matrix but stored in column-major order (transposed), you would swap the operands.
Thus (vector on the right-hand side -- also known as post-multiplication):

[ 0   0   0   m41 ]   [ x ]
[ 0   0   0   m42 ] * [ y ]
[ 0   0   0   m43 ]   [ z ]
[ 0   0   0   m44 ]   [ w ]

When the vector appears on the right-hand side, it is interpreted as a column vector.
Is equivalent to (vector on the left-hand side -- also known as pre-multiplication):

[ x, y, z, w ] * [  0    0    0    0  ]
                 [  0    0    0    0  ]
                 [  0    0    0    0  ]
                 [ m41  m42  m43  m44 ]

When the vector appears on the left-hand side, it is interpreted as a row vector.
There is no universally correct side, it depends on how the matrix is represented.
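A quick way to convince yourself of the transpose equivalence outside a shader is a small standalone program. The sketch below uses OpenCV's fixed-size matrix types only because OpenCV appears elsewhere in this thread; the numbers are arbitrary:

#include <opencv2/core.hpp>
#include <iostream>

int main()
{
    // Arbitrary matrix and vector, just to show v^T * M^T == (M * v)^T
    cv::Matx44d M( 1,  2,  3,  4,
                   5,  6,  7,  8,
                   9, 10, 11, 12,
                  13, 14, 15, 16);
    cv::Matx<double, 4, 1> v(1, 2, 3, 4);  // column vector
    cv::Matx<double, 1, 4> vT = v.t();     // the same vector as a row

    cv::Matx<double, 4, 1> post = M * v;        // post-multiplication (vector on the right)
    cv::Matx<double, 1, 4> pre  = vT * M.t();   // pre-multiplication of the transpose

    std::cout << post.t() << std::endl;  // [30, 70, 110, 150]
    std::cout << pre << std::endl;       // [30, 70, 110, 150] -- identical
    return 0;
}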
Not sure if this is caused by matrix order (row-major or column-major); take a look at the type modifier and mul function documentation.
Update:
I have tested this in my project; using the row_major keyword makes the second case work.
row_major matrix matViewProjection;
mul(matViewProjection, inPos);
I have been trying to achieve something which should be pretty trivial, and is trivial in Matlab.
Using methods of OpenCV, I want to simply achieve something such as:
cv::Mat sample = (cv::Mat_<double>(3, 3) << 4, 5, 6, 4, 2, 5, 1, 4, 2);
sample = 5 * sample;
After which sample should just be:
[20 25 30; 20 10 25; 5 20 10]
I have tried scaleAdd, mul, and multiply; none of them accepts a scalar multiplier, and all require a matrix of the same size and type. In this scenario I could create a matrix of ones and then use the scale parameter, but that seems so very extraneous.
Any direct simple method would be great!
OpenCV does in fact support multiplication by a scalar value with overloaded operator*. You might need to initialize the matrix correctly, though.
float data[] = {1, 2, 3,
                4, 5, 6,
                7, 8, 9};
cv::Mat m(3, 3, CV_32FC1, data);
m = 3*m; // This works just fine
If you are mainly interested in mathematical operations, cv::Matx is a little easier to work with:
cv::Matx33f mx(1, 2, 3,
               4, 5, 6,
               7, 8, 9);
mx *= 4; // This works too
For Java there is no operator overloading, but the Mat object provides the functionality with the convertTo method:
Mat dst = new Mat(src.rows(), src.cols(), src.type());
src.convertTo(dst, -1, scale, offset);
See the documentation of Mat.convertTo for details.
For big Mats you should use forEach.
If C++11 is available:
m.forEach<double>([&](double& element, const int position[]) -> void
{
    element *= 5;
});
Something like this:
Mat m = (Mat_<float>(3, 3) <<
    1, 2, 3,
    4, 5, 6,
    7, 8, 9) * 5;
Mat A = ...; // data, source matrix
Mat B = new Mat(); // destination matrix
Scalar alpha = new Scalar(5); // factor
Core.multiply(A, alpha, B);