How could I to transform a pixel from a camera image plane to another camera image plane? - opencv

Two cameras , Calibration is done between them and both intrinsic and extrinsic matrices are obtained , I am able to get (U,V) of the first camera , How could i get (U,V) of the second camera ? What is the kind of transformation could be made ?
Positions of cameras is fixed

Homography is the way a two 2D planes could be related

Since these cameras are paralel to each other(i.e. stereo), y axis of a point(x,y) in the first image will remain the same in the second image, i.e . y' = y. Only x will change. ( y is vertical axis, x is horizontal).
There are some techniques to find x'. The easiest one is normalized cross correlation. Choose a window around the points, do normalized cross correlation. The result will be an array of width of the image.
Unless you are searching for a point in a smooth region, maximum value in your array (peak) is expected to be your matching point.
Alternatively, you can try SIFT/SURF feature but I am not expert on those. I only know there are functions you can use in Matlab (such as detectSURFfeatures).
Note that if you are using two different cameras, you have to calibrate both of them.

Related

stereo rectification with measured extrinsic parameters

I am trying to rectify two sequences of images for stereo matching. The usual approach of using stereoCalibrate() with a checkerboard pattern is not of use to me, since I am only working with the footage.
What I have is the correct calibration data of the individual cameras (camera matrix and distortion parameters) as well as measurements of their distance and angle between each other.
How can I construct the rotation matrix and translation vector needed for stereoRectify()?
The naive approach of using
Mat T = (Mat_<double>(3,1) << distance, 0, 0);
Mat R = (Mat_<double>(3,3) << cos(angle), 0, sin(angle), 0, 1, 0, -sin(angle), 0, cos(angle));
resulted in a heavily warped image. Do these matrices need to relate to a different origin point I am not aware of? Or do I need to convert the distance/angle value somehow to be dependent of pixelsize?
Any help would be appreciated.
It's not clear whether you have enough information about the camera poses to perform an accurate rectification.
Both T and R are measured in 3D, but in your case:
T is one-dimensional (along x-axis only), which means that you are confident that the two cameras are perfectly aligned along the other axes (in particular, you have less-than-1 pixel error on the y axis, ie a few microns by today's standards);
R leaves the Y-coordinates untouched. Thus, all you have is rotation around this axis, does it match your experimental setup ?
Finally, you need to check the consistency of the units that you are using for the translation and rotation to match with the units from the intrinsic data.
If it is feasible, you can check your results by finding some matching points between the two cameras and proceeding to a projective calibration: the accurate knowledge of the 3D position of the calibration points is required for metric reconstruction only. Other tasks rely on the essential or fundamental matrices, that can be computed from image-to-image point correspondences.
If intrinsics and extrinsics known, I recommend this method: http://link.springer.com/article/10.1007/s001380050120#page-1
It is easy to implement. Basically you rotate the right camera till both cameras have the same orientation, means both share a common R. Epipols are then transformed to the infinity and you have epipolar lines parallel to the image x-axis.
First row of the new R (x) is simply the baseline, e.g. the subtraction of both camera centers. Second row (y) the cross product of the baseline with the old left z-axis. Third row (z) equals cross product of the first two rows.
At last you need to calculate a 3x3 homography described in the above link and use warpPerspective() to get a rectified version.

finding the real world coordinates of an image point

I am searching lots of resources on internet for many days but i couldnt solve the problem.
I have a project in which i am supposed to detect the position of a circular object on a plane. Since on a plane, all i need is x and y position (not z) For this purpose i have chosen to go with image processing. The camera(single view, not stereo) position and orientation is fixed with respect to a reference coordinate system on the plane and are known
I have detected the image pixel coordinates of the centers of circles by using opencv. All i need is now to convert the coord. to real world.
http://www.packtpub.com/article/opencv-estimating-projective-relations-images
in this site and other sites as well, an homographic transformation is named as:
p = C[R|T]P; where P is real world coordinates and p is the pixel coord(in homographic coord). C is the camera matrix representing the intrinsic parameters, R is rotation matrix and T is the translational matrix. I have followed a tutorial on calibrating the camera on opencv(applied the cameraCalibration source file), i have 9 fine chessbordimages, and as an output i have the intrinsic camera matrix, and translational and rotational params of each of the image.
I have the 3x3 intrinsic camera matrix(focal lengths , and center pixels), and an 3x4 extrinsic matrix [R|T], in which R is the left 3x3 and T is the rigth 3x1. According to p = C[R|T]P formula, i assume that by multiplying these parameter matrices to the P(world) we get p(pixel). But what i need is to project the p(pixel) coord to P(world coordinates) on the ground plane.
I am studying electrical and electronics engineering. I did not take image processing or advanced linear algebra classes. As I remember from linear algebra course we can manipulate a transformation as P=[R|T]-1*C-1*p. However this is in euclidian coord system. I dont know such a thing is possible in hompographic. moreover 3x4 [R|T] Vector is not invertible. Moreover i dont know it is the correct way to go.
Intrinsic and extrinsic parameters are know, All i need is the real world project coordinate on the ground plane. Since point is on a plane, coordinates will be 2 dimensions(depth is not important, as an argument opposed single view geometry).Camera is fixed(position,orientation).How should i find real world coordinate of the point on an image captured by a camera(single view)?
EDIT
I have been reading "learning opencv" from Gary Bradski & Adrian Kaehler. On page 386 under Calibration->Homography section it is written: q = sMWQ where M is camera intrinsic matrix, W is 3x4 [R|T], S is an "up to" scale factor i assume related with homography concept, i dont know clearly.q is pixel cooord and Q is real coord. It is said in order to get real world coordinate(on the chessboard plane) of the coord of an object detected on image plane; Z=0 then also third column in W=0(axis rotation i assume), trimming these unnecessary parts; W is an 3x3 matrix. H=MW is an 3x3 homography matrix.Now we can invert homography matrix and left multiply with q to get Q=[X Y 1], where Z coord was trimmed.
I applied the mentioned algorithm. and I got some results that can not be in between the image corners(the image plane was parallel to the camera plane just in front of ~30 cm the camera, and i got results like 3000)(chessboard square sizes were entered in milimeters, so i assume outputted real world coordinates are again in milimeters). Anyway i am still trying stuff. By the way the results are previosuly very very large, but i divide all values in Q by third component of the Q to get (X,Y,1)
FINAL EDIT
I could not accomplish camera calibration methods. Anyway, I should have started with perspective projection and transform. This way i made very well estimations with a perspective transform between image plane and physical plane(having generated the transform by 4 pairs of corresponding coplanar points on the both planes). Then simply applied the transform on the image pixel points.
You said "i have the intrinsic camera matrix, and translational and rotational params of each of the image.” but these are translation and rotation from your camera to your chessboard. These have nothing to do with your circle. However if you really have translation and rotation matrices then getting 3D point is really easy.
Apply the inverse intrinsic matrix to your screen points in homogeneous notation: C-1*[u, v, 1], where u=col-w/2 and v=h/2-row, where col, row are image column and row and w, h are image width and height. As a result you will obtain 3d point with so-called camera normalized coordinates p = [x, y, z]T. All you need to do now is to subtract the translation and apply a transposed rotation: P=RT(p-T). The order of operations is inverse to the original that was rotate and then translate; note that transposed rotation does the inverse operation to original rotation but is much faster to calculate than R-1.

Find Distance between barcode and camera?

Is it possible to find the distance between the detected qr bar code (square) and the camera, if the size of the actual bar code and the (x,y) of all the corners of the bar code detected by the camera are known?
I want the method to work even if the camera is at an angle from the barcode.
I tried using a simple equation like f=d*z/D , where f is the local length of the camera, D is the size of the object, d is the width of the detected object in pixels, and z is the distance between the camera and the barcode. First, I calculate the focal length by using data from a known distance and then get the z values accordingly.
While the above method works pretty well but it has a lot of error if the camera is at an angle.
Is there is a better method to do this ?
Also, I can use only one camera, using two cameras is not an option.
Use your current formula (which you state works well) against the longest side and its opposite, then average the results.
Alternatively, just average the lengths of the longest side and its opposite. The relationships are all linear so you should end up with the same answer.
First you have to know the camera angle.
If you can not read that parameter from a device you could estimate that parameter by using other measures.
For example you know that a bar-code is rectangular. So by detecting it you could obtain four angles and from that estimate a homografy matrix. By knowing the homography matrix you could simplify your problem by just multiplying the coordinates with a homography inverse.
Homography matrix is wiedly used in camera calibration when a known pattern is presented such as chessboard.

Project 2d points in camera 1 image to camera 2 image after a stereo calibration

I am doing stereo calibration of two cameras (let's name them L and R) with opencv. I use 20 pairs of checkerboard images and compute the transformation of R with respect to L. What I want to do is use a new pair of images, compute the 2d checkerboard corners in image L, transform those points according to my calibration and draw the corresponding transformed points on image R with the hope that they will match the corners of the checkerboard in that image.
I tried the naive way of transforming the 2d points from [x,y] to [x,y,1], multiply by the 3x3 rotation matrix, add the rotation vector and then divide by z, but the result is wrong, so I guess it's not that simple (?)
Edit (to clarify some things):
The reason I want to do this is basically because I want to validate the stereo calibration on a new pair of images. So, I don't actually want to get a new 2d transformation between the two images, I want to check if the 3d transformation I have found is correct.
This is my setup:
I have the rotation and translation relating the two cameras (E), but I don't have rotations and translations of the object in relation to each camera (E_R, E_L).
Ideally what I would like to do:
Choose the 2d corners in image from camera L (in pixels e.g. [100,200] etc).
Do some kind of transformation on the 2d points based on matrix E that I have found.
Get the corresponding 2d points in image from camera R, draw them, and hopefully they match the actual corners!
The more I think about it though, the more I am convinced that this is wrong/can't be done.
What I am probably trying now:
Using the intrinsic parameters of the cameras (let's say I_R and I_L), solve 2 least squares systems to find E_R and E_L
Choose 2d corners in image from camera L.
Project those corners to their corresponding 3d points (3d_points_L).
Do: 3d_points_R = (E_L).inverse * E * E_R * 3d_points_L
Get the 2d_points_R from 3d_points_R and draw them.
I will update when I have something new
It is actually easy to do that but what you're making several mistakes. Remember after stereo calibration R and L relate the position and orientation of the second camera to the first camera in the first camera's 3D coordinate system. And also remember to find the 3D position of a point by a pair of cameras you need to triangulate the position. By setting the z component to 1 you're making two mistakes. First, most likely you have used the common OpenCV stereo calibration code and have given the distance between the corners of the checker board in cm. Hence, z=1 means 1 cm away from the center of camera, that's super close to the camera. Second, by setting the same z for all the points you are saying the checker board is perpendicular to the principal axis (aka optical axis, or principal ray), while most likely in your image that's not the case. So you're transforming some virtual 3D points first to the second camera's coordinate system and then projecting them onto the image plane.
If you want to transform just planar points then you can find the homography between the two cameras (OpenCV has the function) and use that.

Is it possible to obtain fronto-parallel view of image or camera position by 2d-3d points relation?

Is it possible to obtain front-parallel view of image or camera position by 2d-3d points relation using OpenCV?
For this I have intrinsic and extrinsic parameters. I have also 3d coordinates of set of control points (which lies in one plane) on image (relation 2d-3d).
In fact I need location and orientation of camera, but it is not difficult to find it if I can convert image to fronto-parallel view.
If it is not possible to do with OpenCV, are the other libraries which can solve this task?
Solution is based on the formulas in the OpenCV documentation Camera Calibration and 3D Reconstruction
Let's consider numerical form without distortion coefficient (in contrast with matrix form).
We have u and v.
It is easy to calculate x' and y'.
But x and y can not be calculate because we can choose any non-zero z.
Line in 3d corresponds to one point in 2d image.
To solve this we take two points for z=1 and z=2.
Then we find 2 points in 3d space which specify line (x1,y1,z1) and (x2,y2,z2).
Then we can apply R-1 to (x1,y1,z1) and (x2,y2,z2) which results in line determined by two points (X1, Y1, Z1) and (X1, Y1, Z1).
Since our control points lie in one plane (let plane is Z=0 for simplicity) we can find corresponding X and Y point which is a point in 3d.
After applying normalization from mm to pixels we obtain fronto-parallel image.
(If we have input image distorted we should undistort it first)

Resources