I'm trying to implement a camera model in Delphi/OpenGL, following the description given in the OpenGL SuperBible. The camera has a position, a forward vector and an up vector. Translating the camera seems to work OK, but when I try to rotate the camera according to the forward vector, I lose sight of my object.
function TCamera.GetCameraOrientation: TMatrix4f;
var
x, z: T3DVector;
begin
z := T3DVector.Create(-FForward.X, -FForward.y, -FForward.z);
x := T3DVector.Cross(z, FUp);
result[0, 0] := x.X;
result[1, 0] := x.Y;
result[2, 0] := x.Z;
result[3, 0] := 0;
result[0, 1] := FUp.X;
result[1, 1] := FUp.Y;
result[2, 1] := FUp.Z;
result[3, 1] := 0;
result[0, 2] := z.x;
result[1, 2] := z.y;
result[2, 2] := z.z;
result[3, 2] := 0;
result[0, 3] := 0;
result[1, 3] := 0;
result[2, 3] := 0;
result[3, 3] := 1;
end;
procedure TCamera.ApplyTransformation;
var
cameraOrient: TMatrix4f;
a, b, c: TMatrix4f;
begin
cameraOrient := getcameraOrientation;
glMultMatrixf(@cameraOrient);
glTranslatef(-FPosition.x, -FPosition.y, -FPosition.z);
end;
Given the position (0, 0, -15), forward vector (0 0 1) and up vector (0 1 0), I expected to get an identity matrix from the GetCameraOrientation method, but instead I get
(1, 0, 0, 0)
(0, 1, 0, 0)
(0, 0, -1, 0)
(0, 0, 0, 1)
If I change the forward vector to (0 0 -1) I get the following matrix:
(-1, 0, 0, 0)
( 0, 1, 0, 0)
( 0, 0, 1, 0)
( 0, 0, 0, 1)
After the call to glMultMatrix( ) and glTranslate( ), glGet( ) gives me the following GL_MODELVIEW_MATRIX:
( 1, 0, 0, 0)
( 0, 1, 0, 0)
( 0, 0, -1, 0)
( 0, 0, 15, 1)
I would have expected the 15 to be in column 4, row 3, not column 3, row 4.
Can anyone see where I get this wrong?
EDIT: The original code from OpenGL SuperBible:
inline void GetCameraOrientation(M3DMatrix44f m)
{
M3DVector3f x, z;
// Make rotation matrix
// Z vector is reversed
z[0] = -vForward[0];
z[1] = -vForward[1];
z[2] = -vForward[2];
// X vector = Y cross Z
m3dCrossProduct(x, vUp, z);
// Matrix has no translation information and is
// transposed.... (rows instead of columns)
#define M(row,col) m[col*4+row]
M(0, 0) = x[0];
M(0, 1) = x[1];
M(0, 2) = x[2];
M(0, 3) = 0.0;
M(1, 0) = vUp[0];
M(1, 1) = vUp[1];
M(1, 2) = vUp[2];
M(1, 3) = 0.0;
M(2, 0) = z[0];
M(2, 1) = z[1];
M(2, 2) = z[2];
M(2, 3) = 0.0;
M(3, 0) = 0.0;
M(3, 1) = 0.0;
M(3, 2) = 0.0;
M(3, 3) = 1.0;
#undef M
}
inline void ApplyCameraTransform(bool bRotOnly = false)
{
M3DMatrix44f m;
GetCameraOrientation(m);
// Camera Transform
glMultMatrixf(m);
// If Rotation only, then do not do the translation
if(!bRotOnly)
glTranslatef(-vOrigin[0], -vOrigin[1], -vOrigin[2]);
}
Given your code for GetCameraOrientation, the resulting matrix is what you should expect: forward = (0, 0, 1) yields z = (0, 0, -1), which is the third row of the matrix. The cross product of z = (0, 0, -1) and FUp = (0, 1, 0) is x = (1, 0, 0), which is the first row. The second row is just a copy of FUp, and the fourth row is fixed.
I don't actually understand what you want to achieve, but when you rotate the camera you will naturally lose sight of your object. It is the same in the real world: if you look at a point and turn your head, the point leaves your view. Have you tried reversing the order of translation and rotation?
Identity matrix
I'm not sure why the SuperBible suggests using (-FForward.X, -FForward.y, -FForward.z) to create your Z vector. If you take out the minus signs then you will get the identity matrix that you expect when your forward vector is (0, 0, 1).
If you want to keep the minus signs, and you want a forward vector of (0, 0, -1) to produce an identity matrix, then you need to change your cross product from Cross(z, FUp) to Cross(FUp, z), because OpenGL uses a right-handed coordinate system. See Cross product.
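To make the effect of the operand order concrete, here is a minimal C++ sketch (not the SuperBible or Delphi code itself). The helper names `cross` and `cameraXAxis` are made up for this illustration. It computes the camera x axis both ways for z = -forward.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;

// Right-handed cross product a x b.
Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]};
}

// Camera x axis for a forward and up vector, using z = -forward.
// upFirst = true  computes up x z (the order in the SuperBible listing).
// upFirst = false computes z x up (the order in the Delphi port).
// Hypothetical helper, written only for this comparison.
Vec3 cameraXAxis(const Vec3& forward, const Vec3& up, bool upFirst) {
    const Vec3 z = {-forward[0], -forward[1], -forward[2]};
    return upFirst ? cross(up, z) : cross(z, up);
}
```

With forward = (0, 0, -1) and up = (0, 1, 0), the up x z order gives x = (1, 0, 0), while z x up gives x = (-1, 0, 0), which matches the -1 in the first row of the second matrix in the question.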
15 in the wrong spot
I agree with you that I would expect a translation matrix to look like this:
(1, 0, 0, x)
(0, 1, 0, y)
(0, 0, 1, z)
(0, 0, 0, 1)
Note, though, that OpenGL stores its matrices in column-major order, not row-major order, so when you glGet the modelview matrix, the 16 values map to rows and columns like this:
(m[0], m[4], m[8], m[12])
(m[1], m[5], m[9], m[13])
(m[2], m[6], m[10], m[14])
(m[3], m[7], m[11], m[15])
If you assumed the values were in row-major order, that may be what is causing the confusion.
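As a small illustration of that layout, the following C++ sketch builds a translation matrix in the same 16-float order and reads it back by (row, column). `Mat4`, `translation` and `at` are names invented for this example, not OpenGL functions.

```cpp
#include <array>
#include <cassert>

// 16 floats in column-major order: element (row, col) lives at index col * 4 + row.
using Mat4 = std::array<float, 16>;

// Translation matrix laid out column by column.
Mat4 translation(float x, float y, float z) {
    Mat4 m = {1.0f, 0.0f, 0.0f, 0.0f,   // column 0
              0.0f, 1.0f, 0.0f, 0.0f,   // column 1
              0.0f, 0.0f, 1.0f, 0.0f,   // column 2
              x,    y,    z,    1.0f};  // column 3 holds the translation
    return m;
}

// Read element (row, col) from the column-major array.
float at(const Mat4& m, int row, int col) {
    assert(row >= 0 && row < 4 && col >= 0 && col < 4);
    return m[col * 4 + row];
}
```

For `translation(0, 0, 15)`, the value 15 is stored at flat index 14. Printing the array four values per line therefore shows it on the fourth line, third position, even though mathematically it sits in row 3, column 4.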
Related
I'm using AVAssetExportSession to export videos in an iOS app. To render the videos in their correct orientation, I'm using AVAssetTrack's preferredTransform. For some source videos, this property seems to have a wrong value, and the video appears offset or completely black in the result. How can I fix this?
The preferredTransform is a CGAffineTransform. The properties a, b, c, d are concatenations of reflection and rotation matrices, and the properties tx and ty describe a translation. In all cases that I observed with an incorrect preferredTransform, the reflection/rotation part appeared to be correct, and only the translation part contained wrong values. A reliable fix seems to be to inspect a, b, c, d (eight cases in total, each corresponding to a case in UIImageOrientation) and update tx and ty accordingly:
extension AVAssetTrack {
var fixedPreferredTransform: CGAffineTransform {
var t = preferredTransform
switch(t.a, t.b, t.c, t.d) {
case (1, 0, 0, 1):
t.tx = 0
t.ty = 0
case (1, 0, 0, -1):
t.tx = 0
t.ty = naturalSize.height
case (-1, 0, 0, 1):
t.tx = naturalSize.width
t.ty = 0
case (-1, 0, 0, -1):
t.tx = naturalSize.width
t.ty = naturalSize.height
case (0, -1, 1, 0):
t.tx = 0
t.ty = naturalSize.width
case (0, 1, -1, 0):
t.tx = naturalSize.height
t.ty = 0
case (0, 1, 1, 0):
t.tx = 0
t.ty = 0
case (0, -1, -1, 0):
t.tx = naturalSize.height
t.ty = naturalSize.width
default:
break
}
return t
}
}
I ended up doing something that I think is slightly more robust: I cancel out the translation based on where the transformed frame would end up:
auto naturalFrame = CGRectMake(0, 0, naturalSize.width, naturalSize.height);
auto preferredFrame = CGRectApplyAffineTransform(naturalFrame, preferredTransform);
preferredTransform.tx -= preferredFrame.origin.x;
preferredTransform.ty -= preferredFrame.origin.y;
Note that you can't just apply the transform on (0, 0) since CGRect.origin takes into account things like flipping.
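The same idea can be sketched without CoreGraphics. The following C++ version is only an illustration: `Affine` and `normalizeTranslation` are hypothetical names, and the struct simply mirrors the a, b, c, d, tx, ty fields. It transforms all four corners of the natural frame and shifts the translation so that the bounding box starts at (0, 0).

```cpp
#include <algorithm>
#include <cassert>

// Same field meaning as CGAffineTransform:
// x' = a*x + c*y + tx,  y' = b*x + d*y + ty.
struct Affine { double a, b, c, d, tx, ty; };

// Shift the translation so the transformed frame's bounding box starts at (0, 0).
Affine normalizeTranslation(Affine t, double width, double height) {
    assert(width >= 0.0 && height >= 0.0);
    const double xs[4] = {0.0, width, 0.0, width};
    const double ys[4] = {0.0, 0.0, height, height};
    double minX = t.a * xs[0] + t.c * ys[0] + t.tx;
    double minY = t.b * xs[0] + t.d * ys[0] + t.ty;
    for (int i = 1; i < 4; ++i) {
        minX = std::min(minX, t.a * xs[i] + t.c * ys[i] + t.tx);
        minY = std::min(minY, t.b * xs[i] + t.d * ys[i] + t.ty);
    }
    t.tx -= minX;
    t.ty -= minY;
    return t;
}
```

For the rotation (a, b, c, d) = (0, 1, -1, 0) and a 1920 x 1080 track, any starting tx and ty end up as tx = 1080 and ty = 0, the same values the switch-based fix assigns for that case.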
I know that to implement the following
I would use this code:
Mat o_k;
Mat Lapl;
double lambda;
Laplacian(o_k, Lapl, o_k.depth(), 1, 1, 0, BORDER_REFLECT);
Lapl = 1.0 - 2.0*lambda*Lapl;
However, I am trying to implement in OpenCV the following equation:
I know the div, or divergence, term would be like this, right?
int ksize = parser.get<int>("ksize");
int scale = parser.get<int>("scale");
int delta = parser.get<int>("delta");
Sobel(res, sobelx, CV_64F, 1, 0, ksize, scale, delta, BORDER_DEFAULT);
Sobel(res, sobely, CV_64F, 0, 1, ksize, scale, delta, BORDER_DEFAULT);
div = sobelx + sobely;
Where res is the result of the term in parentheses. But how do I get the term in parentheses?
Or am I doing this wrong? Would div above actually be equal to the gradient of res? If so, then how do I get the divergence?
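For reference, divergence is defined on a vector field with two components, F = (fx, fy), as d(fx)/dx + d(fy)/dy. Adding the x and y derivatives of one scalar image, as in the snippet above, is a different quantity. The following plain C++ sketch shows the definition with central differences. It does not use OpenCV, and `Grid` and `divergence` are names chosen only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<double>>;

// Divergence of the vector field (fx, fy) by central differences:
// div F = d(fx)/dx + d(fy)/dy. Border cells are left at 0.
Grid divergence(const Grid& fx, const Grid& fy) {
    assert(!fx.empty() && fx.size() == fy.size() && fx[0].size() == fy[0].size());
    const std::size_t rows = fx.size();
    const std::size_t cols = fx[0].size();
    Grid div(rows, std::vector<double>(cols, 0.0));
    for (std::size_t r = 1; r + 1 < rows; ++r) {
        for (std::size_t c = 1; c + 1 < cols; ++c) {
            const double dfx_dx = (fx[r][c + 1] - fx[r][c - 1]) / 2.0;  // x runs along columns
            const double dfy_dy = (fy[r + 1][c] - fy[r - 1][c]) / 2.0;  // y runs along rows
            div[r][c] = dfx_dx + dfy_dy;
        }
    }
    return div;
}
```

In OpenCV terms, this would presumably correspond to taking the x derivative of the field's x component and the y derivative of its y component, then adding the two results.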
EDIT:
According to this link, the magnitude can also be computed as mag = abs(x) + abs(y): https://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/sobel_derivatives/sobel_derivatives.html#sobel-derivatives
And since the div of a gradient is the Laplacian, would the below code be equivalent to the 2nd equation?
Sobel(res, sobelx, CV_64F, 1, 0, ksize, scale, delta, BORDER_DEFAULT);
Sobel(res, sobely, CV_64F, 0, 1, ksize, scale, delta, BORDER_DEFAULT);
convertScaleAbs( sobelx, abs_grad_x );
convertScaleAbs( sobely, abs_grad_y );
/// Total Gradient (approximate)
Mat mag;
addWeighted( abs_grad_x, 1, abs_grad_y, 1, 0, mag);
Laplacian(o_k, Lapl, o_k.depth(), 1, 1, 0, BORDER_REFLECT);
Mat top;
top = lambda * Lapl;
Mat result;
divide(top, mag, result);
Apologies if this seems trivial - relatively new to openCV.
Essentially, I'm trying to create a function that can take in a camera's image, the known world coordinates of that image, and the world coordinates of some other point 2, and then transform the camera's image to what it would look like if the camera was at point 2. From my understanding, the best way to tackle this is using a homography transformation using the warpPerspective tool.
The experiment is being done inside the Unreal Game simulation engine. Right now, I essentially read the data from the camera, and add a set transformation to the image. However, I seem to be doing something wrong as the image is looking something like this (original image first then distorted image):
Original Image
Distorted Image
This is the current code I have. Basically, it reads in the texture from Unreal Engine, then gets the individual pixel values and puts them into the OpenCV Mat. Then I try to apply my warpPerspective transformation. Interestingly, if I just try a simple warpAffine transformation (rotation), it works fine. I have seen this question: Opencv virtually camera rotating/translating for bird's eye view, but I cannot figure out what I am doing differently from their solution. I would really appreciate any help or guidance. Thanks in advance!
ROSCamTextureRenderTargetRes->ReadPixels(ImageData);
cv::Mat image_data_matrix(TexHeight, TexWidth, CV_8UC3);
cv::Mat warp_dst, warp_rotate_dst;
int currCol = 0;
int currRow = 0;
cv::Vec3b* pixel_left = image_data_matrix.ptr<cv::Vec3b>(currRow);
for (auto color : ImageData)
{
pixel_left[currCol][2] = color.R;
pixel_left[currCol][1] = color.G;
pixel_left[currCol][0] = color.B;
currCol++;
if (currCol == TexWidth)
{
currRow++;
currCol = 0;
pixel_left = image_data_matrix.ptr<cv::Vec3b>(currRow);
}
}
warp_dst = cv::Mat(image_data_matrix.rows, image_data_matrix.cols, image_data_matrix.type());
double rotX = (45 - 90)*PI / 180;
double rotY = (90 - 90)*PI / 180;
double rotZ = (90 - 90)*PI / 180;
cv::Mat A1 = (cv::Mat_<float>(4, 3) <<
1, 0, (-1)*TexWidth / 2,
0, 1, (-1)*TexHeight / 2,
0, 0, 0,
0, 0, 1);
// Rotation matrices Rx, Ry, Rz
cv::Mat RX = (cv::Mat_<float>(4, 4) <<
1, 0, 0, 0,
0, cos(rotX), (-1)*sin(rotX), 0,
0, sin(rotX), cos(rotX), 0,
0, 0, 0, 1);
cv::Mat RY = (cv::Mat_<float>(4, 4) <<
cos(rotY), 0, (-1)*sin(rotY), 0,
0, 1, 0, 0,
sin(rotY), 0, cos(rotY), 0,
0, 0, 0, 1);
cv::Mat RZ = (cv::Mat_<float>(4, 4) <<
cos(rotZ), (-1)*sin(rotZ), 0, 0,
sin(rotZ), cos(rotZ), 0, 0,
0, 0, 1, 0,
0, 0, 0, 1);
// R - rotation matrix
cv::Mat R = RX * RY * RZ;
// T - translation matrix
cv::Mat T = (cv::Mat_<float>(4, 4) <<
1, 0, 0, 0,
0, 1, 0, 0,
0, 0, 1, dist,
0, 0, 0, 1);
// K - intrinsic matrix
cv::Mat K = (cv::Mat_<float>(3, 4) <<
12.5, 0, TexHeight / 2, 0,
0, 12.5, TexWidth / 2, 0,
0, 0, 1, 0
);
cv::Mat warp_mat = K * (T * (R * A1));
//warp_mat = cv::getRotationMatrix2D(srcTri[0], 43.0, 1);
//cv::warpAffine(image_data_matrix, warp_dst, warp_mat, warp_dst.size());
cv::warpPerspective(image_data_matrix, warp_dst, warp_mat, image_data_matrix.size(), CV_INTER_CUBIC | CV_WARP_INVERSE_MAP);
cv::imshow("distort", warp_dst);
cv::imshow("image", image_data_matrix);
Given a 3 x 3 rotation matrix, R, and a 3 x 1 translation matrix, T, I am wondering how to apply the T and R matrices to an image.
Let's say the IplImage img is 640 x 480.
What I want to do is R*(T*img).
I was thinking of using cvGemm, but that didn't work.
The function you are searching for is probably warpPerspective() : this is a use case...
// Projection 2D -> 3D matrix
Mat A1 = (Mat_<double>(4,3) <<
1, 0, -w/2,
0, 1, -h/2,
0, 0, 0,
0, 0, 1);
// Rotation matrices around the X axis
Mat R = (Mat_<double>(4, 4) <<
1, 0, 0, 0,
0, cos(alpha), -sin(alpha), 0,
0, sin(alpha), cos(alpha), 0,
0, 0, 0, 1);
// Translation matrix on the Z axis
Mat T = (Mat_<double>(4, 4) <<
1, 0, 0, 0,
0, 1, 0, 0,
0, 0, 1, dist,
0, 0, 0, 1);
// Camera intrinsics matrix 3D -> 2D
Mat A2 = (Mat_<double>(3,4) <<
f, 0, w/2, 0,
0, f, h/2, 0,
0, 0, 1, 0);
Mat transfo = A2 * (T * (R * A1));
Mat source;
Mat destination;
warpPerspective(source, destination, transfo, source.size(), INTER_CUBIC | WARP_INVERSE_MAP);
I hope it could help you,
Julien
PS : I gave the example with a projection from 2D to 3D but you can use directly transfo = T* R;
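To see why the chained product can be passed to warpPerspective, it may help to multiply the four matrices by hand: a 3x4 times 4x4 times 4x4 times 4x3 product collapses to a single 3x3 matrix. The sketch below uses plain C++ instead of cv::Mat. `Matrix`, `multiply` and `homography` are illustrative names, and only the rotation around X from the snippet above is included.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Plain matrix product; inner dimensions must agree.
Matrix multiply(const Matrix& a, const Matrix& b) {
    assert(!a.empty() && !b.empty() && a[0].size() == b.size());
    Matrix out(a.size(), std::vector<double>(b[0].size(), 0.0));
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < b[0].size(); ++j)
            for (std::size_t k = 0; k < b.size(); ++k)
                out[i][j] += a[i][k] * b[k][j];
    return out;
}

// Same chain as above: A2 * (T * (R * A1)), with R a rotation around the X axis.
Matrix homography(double w, double h, double alpha, double f, double dist) {
    const double c = std::cos(alpha), s = std::sin(alpha);
    const Matrix A1 = {{1.0, 0.0, -w / 2}, {0.0, 1.0, -h / 2}, {0.0, 0.0, 0.0}, {0.0, 0.0, 1.0}};
    const Matrix R  = {{1.0, 0.0, 0.0, 0.0}, {0.0, c, -s, 0.0}, {0.0, s, c, 0.0}, {0.0, 0.0, 0.0, 1.0}};
    const Matrix T  = {{1.0, 0.0, 0.0, 0.0}, {0.0, 1.0, 0.0, 0.0}, {0.0, 0.0, 1.0, dist}, {0.0, 0.0, 0.0, 1.0}};
    const Matrix A2 = {{f, 0.0, w / 2, 0.0}, {0.0, f, h / 2, 0.0}, {0.0, 0.0, 1.0, 0.0}};
    return multiply(A2, multiply(T, multiply(R, A1)));
}
```

With alpha = 0 and f equal to dist, the result is f times the identity matrix. A homography is only defined up to scale, so that case leaves the image unchanged, which makes it a convenient sanity check before trying other angles.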
I have a calibrated camera whose intrinsic and extrinsic parameters I know exactly. The height of the camera is also known. Now I want to virtually rotate the camera to get a bird's-eye view, so that I can build the homography matrix from the three rotation angles and the translation.
I know that corresponding points can be transformed from one image to the other via a homography as
x = K * (R - t * n^T / d) * K^-1 * x'
There are a few things I'd like to know now:
If I want to bring the image coordinate back into the camera coordinate system (CCS), I have to multiply it by K^-1, right? And as the image coordinate I use (x', y', 1)?
Then I need to build a rotation matrix for rotating the CCS, but which convention should I use? And how do I know how to set up my world coordinate system (WCS)?
The next thing is the normal and the distance. Is it right to just take three points lying on the ground and compute the normal from them? And is the distance then the camera height?
I'd also like to know how I can change the height of the virtual bird's-eye camera, so that I can say I want to see the ground plane from a height of 3 meters. How can I use the unit "meter" in the translation and the homography matrix?
That's all for now; it would be great if someone could enlighten and help me. And please don't suggest generating the bird's-eye view with "getperspective"; I've already tried that, and it is not suitable for me.
Senna
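The homography formula quoted in the question can be written out directly. The sketch below takes the formula exactly as stated there, assumes a simple pinhole K with a single focal length f and principal point (cx, cy), and uses made-up names (`Mat3`, `Vec3`, `mul`, `planeHomography`). The sign in front of the t * n^T / d term depends on how the plane normal and distance are defined, so treat it as an assumption.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// 3x3 matrix product.
Mat3 mul(const Mat3& a, const Mat3& b) {
    Mat3 out{};  // zero-initialized
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                out[i][j] += a[i][k] * b[k][j];
    return out;
}

// H = K * (R - t * n^T / d) * K^-1 for K = [f 0 cx; 0 f cy; 0 0 1].
// t and d must use the same length unit (for example meters).
Mat3 planeHomography(double f, double cx, double cy,
                     const Mat3& R, const Vec3& t, const Vec3& n, double d) {
    assert(f != 0.0 && d != 0.0);
    const Mat3 K    = {{{f, 0.0, cx}, {0.0, f, cy}, {0.0, 0.0, 1.0}}};
    const Mat3 Kinv = {{{1.0 / f, 0.0, -cx / f}, {0.0, 1.0 / f, -cy / f}, {0.0, 0.0, 1.0}}};
    Mat3 M = R;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            M[i][j] -= t[i] * n[j] / d;  // subtract the outer product t * n^T, scaled by 1/d
    return mul(K, mul(M, Kinv));
}
```

Since t only ever appears divided by d, the unit cancels: expressing both in meters (or both in millimeters) gives the same H. With R equal to the identity and t = 0, the function returns the identity homography.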
This is the code I would suggest (it is one of mine); I think it answers many of your questions.
If you want the distance, note that it sits in the translation matrix T, as the entry in row 3, column 4.
I hope it helps.
Mat source=imread("Whatyouwant.jpg");
int alpha_ = 90, beta_ = 90, gamma_ = 90;
int f_ = 500, dist_ = 500;
Mat destination;
string wndname1 = getFormatWindowName("Source: ");
string wndname2 = getFormatWindowName("WarpPerspective: ");
string tbarname1 = "Alpha";
string tbarname2 = "Beta";
string tbarname3 = "Gamma";
string tbarname4 = "f";
string tbarname5 = "Distance";
namedWindow(wndname1, 1);
namedWindow(wndname2, 1);
createTrackbar(tbarname1, wndname2, &alpha_, 180);
createTrackbar(tbarname2, wndname2, &beta_, 180);
createTrackbar(tbarname3, wndname2, &gamma_, 180);
createTrackbar(tbarname4, wndname2, &f_, 2000);
createTrackbar(tbarname5, wndname2, &dist_, 2000);
imshow(wndname1, source);
while(true) {
double f, dist;
double alpha, beta, gamma;
alpha = ((double)alpha_ - 90.)*PI/180;
beta = ((double)beta_ - 90.)*PI/180;
gamma = ((double)gamma_ - 90.)*PI/180;
f = (double) f_;
dist = (double) dist_;
Size taille = source.size();
double w = (double)taille.width, h = (double)taille.height;
// Projection 2D -> 3D matrix
Mat A1 = (Mat_<double>(4,3) <<
1, 0, -w/2,
0, 1, -h/2,
0, 0, 0,
0, 0, 1);
// Rotation matrices around the X,Y,Z axis
Mat RX = (Mat_<double>(4, 4) <<
1, 0, 0, 0,
0, cos(alpha), -sin(alpha), 0,
0, sin(alpha), cos(alpha), 0,
0, 0, 0, 1);
Mat RY = (Mat_<double>(4, 4) <<
cos(beta), 0, -sin(beta), 0,
0, 1, 0, 0,
sin(beta), 0, cos(beta), 0,
0, 0, 0, 1);
Mat RZ = (Mat_<double>(4, 4) <<
cos(gamma), -sin(gamma), 0, 0,
sin(gamma), cos(gamma), 0, 0,
0, 0, 1, 0,
0, 0, 0, 1);
// Composed rotation matrix with (RX,RY,RZ)
Mat R = RX * RY * RZ;
// Translation matrix along the Z axis; changing dist changes the height
Mat T = (Mat_<double>(4, 4) <<
1, 0, 0, 0,
0, 1, 0, 0,
0, 0, 1, dist,
0, 0, 0, 1);
// Camera intrinsics matrix 3D -> 2D
Mat A2 = (Mat_<double>(3,4) <<
f, 0, w/2, 0,
0, f, h/2, 0,
0, 0, 1, 0);
// Final and overall transformation matrix
Mat transfo = A2 * (T * (R * A1));
// Apply matrix transformation
warpPerspective(source, destination, transfo, taille, INTER_CUBIC | WARP_INVERSE_MAP);
imshow(wndname2, destination);
waitKey(30);
}
This code works for me, but I don't know why the roll and pitch angles are exchanged. When I change "alpha", the image is warped in pitch, and when I change "beta", the image is warped in roll. So I changed my rotation matrices, as can be seen below.
Also, RY has a sign error. You can check Ry at: http://en.wikipedia.org/wiki/Rotation_matrix.
The rotation matrices I use:
Mat RX = (Mat_<double>(4, 4) <<
1, 0, 0, 0,
0, cos(beta), -sin(beta), 0,
0, sin(beta), cos(beta), 0,
0, 0, 0, 1);
Mat RY = (Mat_<double>(4, 4) <<
cos(alpha), 0, sin(alpha), 0,
0, 1, 0, 0,
-sin(alpha), 0, cos(alpha), 0,
0, 0, 0, 1);
Mat RZ = (Mat_<double>(4, 4) <<
cos(gamma), -sin(gamma), 0, 0,
sin(gamma), cos(gamma), 0, 0,
0, 0, 1, 0,
0, 0, 0, 1);
Regards
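As a quick check of the sign convention discussed above, here is a minimal C++ sketch of the standard right-handed rotation about the Y axis, applied to a single vector. `Vec3` and `rotateY` are names chosen for this example only.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Standard right-handed rotation about the Y axis:
//   [  cos  0  sin ]
//   [   0   1   0  ]
//   [ -sin  0  cos ]
Vec3 rotateY(const Vec3& v, double angle) {
    const double c = std::cos(angle), s = std::sin(angle);
    return {c * v[0] + s * v[2], v[1], -s * v[0] + c * v[2]};
}
```

Rotating (0, 0, 1) by +90 degrees with this matrix gives (1, 0, 0). The variant with the two sin signs swapped is the same rotation by the negated angle, so it is still a valid rotation matrix; it just turns in the opposite direction for a given slider value.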