Rectifying images in OpenCV with intrinsic and extrinsic parameters already found

I ran Bouguet's calibration toolbox (http://www.vision.caltech.edu/bouguetj/calib_doc/htmls/example.html) on Matlab and have the parameters from the calibration (intrinsic [focal lengths and principal point offsets] and extrinsic [rotation and translations of the checkerboard with respect to the camera]).
Feature coordinate points of the checkerboard on my images are also known.
I want to obtain rectified images so that I can compute a disparity map (for which I already have code) from each pair of rectified images.
How can I go about doing this?

The documentation is here. At the end, it reads "Add these values as constants to your program, call the initUndistortRectifyMap and the remap function to remove distortion and enjoy distortion free inputs with cheap and low quality cameras".
Once your cameras are rectified, you may be interested in the StereoVar or StereoBM classes to get the disparity map. Use reprojectImageTo3D once you are done if you want to check that your results look fine in 3D.
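A minimal sketch of that pipeline, assuming an OpenCV 3.x/4.x-style API and that the intrinsics (K1, D1, K2, D2) and the rectification transforms (R1, R2, P1, P2, e.g. from cv::stereoRectify) have already been computed; the image file names and StereoBM parameters are placeholders:

    #include <opencv2/opencv.hpp>

    int main()
    {
        // Calibration results: fill these from your saved calibration
        // (e.g. read them with cv::FileStorage) before running.
        cv::Mat K1, D1, K2, D2, R1, R2, P1, P2;

        cv::Mat leftRaw  = cv::imread("left.png",  cv::IMREAD_GRAYSCALE);
        cv::Mat rightRaw = cv::imread("right.png", cv::IMREAD_GRAYSCALE);

        // Build the undistortion + rectification lookup maps once per camera...
        cv::Mat map1x, map1y, map2x, map2y;
        cv::initUndistortRectifyMap(K1, D1, R1, P1, leftRaw.size(),  CV_32FC1, map1x, map1y);
        cv::initUndistortRectifyMap(K2, D2, R2, P2, rightRaw.size(), CV_32FC1, map2x, map2y);

        // ...then remap every incoming image pair.
        cv::Mat leftRect, rightRect;
        cv::remap(leftRaw,  leftRect,  map1x, map1y, cv::INTER_LINEAR);
        cv::remap(rightRaw, rightRect, map2x, map2y, cv::INTER_LINEAR);

        // Block matching on the rectified pair gives the disparity map.
        cv::Ptr<cv::StereoBM> bm = cv::StereoBM::create(64, 21);  // numDisparities, blockSize
        cv::Mat disparity;
        bm->compute(leftRect, rightRect, disparity);              // 16-bit fixed point, scaled by 16

        // Scale to 8 bits for a quick visualisation.
        cv::Mat disp8;
        disparity.convertTo(disp8, CV_8U, 255.0 / (16.0 * 64.0));
        cv::imwrite("disparity.png", disp8);
        return 0;
    }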

If the cameras are fully calibrated, use the approach described here: http://link.springer.com/article/10.1007/s001380050120#page-1
After rectification both cameras have the same orientation, i.e. they share the same new R.
The first row of the new R is the baseline, i.e. the difference of the two camera centers (normalized to unit length). The second row is the cross product of the baseline with the old left z-axis (the third row of R_old_left). The third row is the cross product of the first two rows.
Warp images with H_left=P_new(1:3,1:3)*P_old_left(1:3,1:3)^-1 and H_right=P_new(1:3,1:3)*P_old_right(1:3,1:3)^-1.
Rectified left pixel coordinates are u_new = (h11*u + h12*v + h13)/(h31*u + h32*v + h33) and v_new = (h21*u + h22*v + h23)/(h31*u + h32*v + h33); the same applies to the right image with H_right.
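A rough sketch of that warp in OpenCV terms, assuming the old and new 3x4 projection matrices described above have already been built as CV_64F Mats (the helper name and argument layout are mine, not from the paper):

    #include <opencv2/opencv.hpp>

    // H = P_new(1:3,1:3) * P_old(1:3,1:3)^-1, applied to both images.
    void rectifyPair(const cv::Mat& P_old_left, const cv::Mat& P_old_right, const cv::Mat& P_new,
                     const cv::Mat& leftImg, const cv::Mat& rightImg,
                     cv::Mat& leftRect, cv::Mat& rightRect)
    {
        // Left 3x3 blocks of the 3x4 projection matrices.
        cv::Mat Qnew = P_new(cv::Rect(0, 0, 3, 3));
        cv::Mat Qol  = P_old_left(cv::Rect(0, 0, 3, 3));
        cv::Mat Qor  = P_old_right(cv::Rect(0, 0, 3, 3));

        cv::Mat H_left  = Qnew * Qol.inv();
        cv::Mat H_right = Qnew * Qor.inv();

        // warpPerspective, with H as the forward map from old to rectified pixels,
        // applies exactly the u_new / v_new formulas written above.
        cv::warpPerspective(leftImg,  leftRect,  H_left,  leftImg.size());
        cv::warpPerspective(rightImg, rightRect, H_right, rightImg.size());
    }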

Related

How can I transform a pixel from one camera's image plane to another camera's image plane?

Two cameras: calibration has been done between them, and both the intrinsic and extrinsic matrices are known. I can get the (U, V) coordinates of a point in the first camera; how can I get the corresponding (U, V) in the second camera? What kind of transformation can be used?
The positions of the cameras are fixed.
A homography is the way two 2D planes can be related.
Since these cameras are parallel to each other (i.e. a rectified stereo pair), the y coordinate of a point (x, y) in the first image stays the same in the second image, i.e. y' = y. Only x changes (y is the vertical axis, x the horizontal).
There are several techniques for finding x'. The easiest is normalized cross correlation: choose a window around the point and correlate it along the corresponding row of the second image. The result is an array roughly as wide as the image.
Unless the point lies in a smooth, textureless region, the maximum value in that array (the peak) is expected to be your matching point; a sketch follows below.
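A minimal sketch of that row search (my own helper, not from the answer), using cv::matchTemplate with TM_CCORR_NORMED; it assumes rectified grayscale images and omits boundary checks:

    #include <opencv2/opencv.hpp>

    // Returns the x' of the best match on row y of the right image for the point
    // (x, y) in the left image. The window size is arbitrary.
    int findMatchAlongRow(const cv::Mat& left, const cv::Mat& right,
                          int x, int y, int win = 15)
    {
        int h = win / 2;
        cv::Mat tmpl  = left(cv::Rect(x - h, y - h, win, win));      // window around the point
        cv::Mat strip = right(cv::Rect(0, y - h, right.cols, win));  // same rows, full width

        // Normalized cross correlation of the window against the strip;
        // the response is a 1 x (width - win + 1) array.
        cv::Mat response;
        cv::matchTemplate(strip, tmpl, response, cv::TM_CCORR_NORMED);

        cv::Point maxLoc;
        cv::minMaxLoc(response, nullptr, nullptr, nullptr, &maxLoc);
        return maxLoc.x + h;   // centre of the best-matching window
    }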
Alternatively, you can try SIFT/SURF features, but I am not an expert on those. I only know there are functions you can use in Matlab (such as detectSURFFeatures).
Note that if you are using two different cameras, you have to calibrate both of them.

Triangulation to find distance to the object - image to world coordinates

Localization of an object specified in the image.
I am working on a computer vision project to find the distance to an object using stereo images. I followed these steps using OpenCV to achieve my objective:
1. Calibration of the camera
2. SURF matching to find the fundamental matrix
3. Rotation and translation vectors using SVD, as described in Hartley and Zisserman's book
4. stereoRectify to get the projection matrices P1, P2 and rotation matrices R1, R2. The rotation matrices can also be found from the homography: R = CameraMatrix.inv() * H * CameraMatrix.
Problems:
I triangulated the point using a least-squares triangulation method to find the real distance to the object. It returns a value of the form [0.79856, 0.354541, 0.258]. How do I map this to real-world coordinates to find the distance to the object?
http://www.morethantechnical.com/2012/01/04/simple-triangulation-with-opencv-from-harley-zisserman-w-code/
Alternative approach:
Find the disparity of the object between the two images and compute the depth using the formula
Depth = (focal length * baseline) / disparity
To get the disparity we have to perform the rectification first, and the points must be undistorted. My rectified images come out black.
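For reference, the formula above with made-up numbers (the focal length, baseline and disparity values below are only illustrative, and the baseline's unit determines the depth's unit):

    #include <iostream>

    int main()
    {
        // Depth = (focal length * baseline) / disparity
        double focalPx   = 700.0;  // fx from the camera matrix, in pixels
        double baseline  = 0.12;   // distance between camera centres, in metres
        double disparity = 18.0;   // x_left - x_right of the matched point, in pixels

        double depth = (focalPx * baseline) / disparity;   // ~4.67 m with these numbers
        std::cout << "depth = " << depth << " m\n";
        return 0;
    }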
Please help me out; it is important.
Here is a detailed explanation of how I implemented the code:
1. Calibration of the camera using a circles grid to get the camera matrix and distortion coefficients. The code is given on GitHub (Android).
2. Take two pictures of a car, the first from the left and the other from the right. Take the sub-image and calculate the fundamental matrix, essential matrix, rotation matrix and translation matrix.
3. I have tried the projection in two ways:
Take the first camera's projection matrix as the identity, build the second 3x4 projection matrix from the rotation and translation matrices, and perform triangulation.
Get the projection matrices P1 and P2 from stereoRectify and perform triangulation.
My object is 65 meters away from the camera, and I don't know how to recover this true distance from the triangulation result [0.79856, 0.354541, 0.258].
Question: do I have to do some extra calibration to get this result? My code does not use any knowledge of the geometric size of the object.
So you already computed the triangulation? Well, then you have points in camera coordinates, i.e. in the coordinate frame centered on one of the cameras (the left or right one depending on how your code is written and the order in which you feed your images to it).
What more do you want? The vector length (square root of the sum of the square coordinates) of those points is their estimated distance from the same camera. If you want their position in some other "world" coordinate system, you need to give the coordinate transform between that system and the camera - presumably through a calibration procedure.
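As a concrete illustration of that last point, here is a minimal sketch using the numbers from the question; the result is in whatever units the calibration translation / checkerboard size was given in:

    #include <cmath>
    #include <iostream>
    #include <opencv2/opencv.hpp>

    int main()
    {
        // The triangulated point from the question, read as (X, Y, Z) in the frame of
        // the reference camera. If your triangulation returns homogeneous (x, y, z, w)
        // coordinates (e.g. cv::triangulatePoints), divide by w first.
        cv::Point3d X(0.79856, 0.354541, 0.258);

        // Distance from that camera = vector length (square root of the sum of squares).
        double distance = std::sqrt(X.x * X.x + X.y * X.y + X.z * X.z);
        std::cout << "distance = " << distance << std::endl;   // about 0.91 here
        return 0;
    }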

Project 2d points in camera 1 image to camera 2 image after a stereo calibration

I am doing stereo calibration of two cameras (let's name them L and R) with opencv. I use 20 pairs of checkerboard images and compute the transformation of R with respect to L. What I want to do is use a new pair of images, compute the 2d checkerboard corners in image L, transform those points according to my calibration and draw the corresponding transformed points on image R with the hope that they will match the corners of the checkerboard in that image.
I tried the naive way of transforming the 2d points from [x, y] to [x, y, 1], multiplying by the 3x3 rotation matrix, adding the rotation vector and then dividing by z, but the result is wrong, so I guess it's not that simple(?)
Edit (to clarify some things):
The reason I want to do this is basically because I want to validate the stereo calibration on a new pair of images. So, I don't actually want to get a new 2d transformation between the two images, I want to check if the 3d transformation I have found is correct.
This is my setup:
I have the rotation and translation relating the two cameras (E), but I don't have rotations and translations of the object in relation to each camera (E_R, E_L).
Ideally what I would like to do:
Choose the 2d corners in image from camera L (in pixels e.g. [100,200] etc).
Do some kind of transformation on the 2d points based on matrix E that I have found.
Get the corresponding 2d points in image from camera R, draw them, and hopefully they match the actual corners!
The more I think about it though, the more I am convinced that this is wrong/can't be done.
What I am probably trying now:
Using the intrinsic parameters of the cameras (let's say I_R and I_L), solve 2 least squares systems to find E_R and E_L
Choose 2d corners in image from camera L.
Project those corners to their corresponding 3d points (3d_points_L).
Do: 3d_points_R = (E_L).inverse * E * E_R * 3d_points_L
Get the 2d_points_R from 3d_points_R and draw them.
I will update when I have something new
It is actually easy to do that, but you're making several mistakes. Remember that after stereo calibration, R and T relate the position and orientation of the second camera to the first camera, in the first camera's 3D coordinate system. Also remember that to find the 3D position of a point seen by a pair of cameras you need to triangulate it. By setting the z component to 1 you're making two mistakes. First, most likely you have used the common OpenCV stereo calibration code and given the distance between the corners of the checkerboard in cm; hence z=1 means 1 cm away from the camera's centre, which is extremely close. Second, by setting the same z for all the points you are saying the checkerboard is perpendicular to the principal axis (aka optical axis, or principal ray), while most likely that's not the case in your image. So you're transforming some virtual 3D points into the second camera's coordinate system and then projecting them onto the image plane.
If you want to transform just planar points then you can find the homography between the two cameras (OpenCV has the function) and use that.
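A minimal sketch of that planar route, assuming a checkerboard with 9x6 inner corners visible in both images (file names and board size are placeholders); note that the resulting H is only valid for points lying on that particular plane:

    #include <opencv2/opencv.hpp>

    int main()
    {
        cv::Mat left  = cv::imread("left.png",  cv::IMREAD_GRAYSCALE);
        cv::Mat right = cv::imread("right.png", cv::IMREAD_GRAYSCALE);
        cv::Size board(9, 6);   // inner corners of the checkerboard

        std::vector<cv::Point2f> cornersL, cornersR;
        if (!cv::findChessboardCorners(left, board, cornersL) ||
            !cv::findChessboardCorners(right, board, cornersR))
            return 1;

        // Homography mapping left-image corners to right-image corners.
        cv::Mat H = cv::findHomography(cornersL, cornersR, cv::RANSAC);

        // Map points on the same plane from the left image to the right image
        // and draw them for a visual check.
        std::vector<cv::Point2f> predictedR;
        cv::perspectiveTransform(cornersL, predictedR, H);

        for (size_t i = 0; i < predictedR.size(); ++i)
            cv::circle(right, predictedR[i], 4, cv::Scalar(255), 2);
        cv::imwrite("check.png", right);
        return 0;
    }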

Can I reuse a homography matrix calculated from 2 different images of the same scene taken by 2 different cameras?

I'm trying to learn OpenCV. I've a question regarding homography and epipolar geometry.
Suppose I've calculated a homography with the cvFindHomography() function, using matched feature points from two static images taken with two cameras from two different viewpoints.
Is it wrong to reuse that homography matrix to detect corresponding points in camera 1 (right) from an image taken by camera 2 (left), given that x' = H.x, where x' is the left image's 2D homogeneous feature point, x is the right image's corresponding 2D homogeneous point and H is the homography matrix, when those 2D points were not used to calculate the homography matrix?
What I mean to ask is: can I reuse the homography matrix calculated for those two cameras to find corresponding points in any images that were not used to calculate it?
Does it matter which images I use once the homography has been determined from fixed images, or do I need to recalculate it every time?
You can use the homography to project points from one image to the other as long as the cameras don't move and the scene doesn't change.
I understand that those (calibrated) cameras take the pictures and then you work with those two pictures all the time. All right: if you calculate the homography, then you can project any points you want between the two images. You will get some error, of course, but that is due to noise in the images and to non-linearities that affect the linear method used by findHomography.
If you keep capturing images with the cameras, then you have to compute the homography again for every new pair of images, because you don't know a priori how the scene will change.
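Mechanically, reusing a stored homography looks like this (a sketch; the file name, storage key and point coordinates are hypothetical, and it only remains valid while the cameras and the observed plane stay fixed):

    #include <opencv2/opencv.hpp>
    #include <iostream>

    int main()
    {
        // H computed once with findHomography and saved earlier with cv::FileStorage.
        cv::Mat H;
        cv::FileStorage fs("homography.yml", cv::FileStorage::READ);
        if (!fs.isOpened()) return 1;
        fs["H"] >> H;

        // New points detected in the right image that were NOT used to estimate H.
        std::vector<cv::Point2f> ptsRight = { {120.f, 85.f}, {310.f, 240.f} };

        // x' = H.x, applied to a whole vector of points at once.
        std::vector<cv::Point2f> ptsLeft;
        cv::perspectiveTransform(ptsRight, ptsLeft, H);

        for (const auto& p : ptsLeft)
            std::cout << p << std::endl;
        return 0;
    }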

Distance to the object using stereo camera

Is there a way to calculate the distance to a specific object using a stereo camera?
Is there an equation or something to get the distance using disparity or angle?
NOTE: Everything described here can be found in the Learning OpenCV book in the chapters on camera calibration and stereo vision. You should read these chapters to get a better understanding of the steps below.
One approach that does not require you to measure all the camera intrinsics and extrinsics yourself is to use OpenCV's calibration functions. Camera intrinsics (lens distortion/skew etc.) can be calculated with cv::calibrateCamera, while the extrinsics (the relation between the left and right cameras) can be calculated with cv::stereoCalibrate. These functions take a number of points in pixel coordinates and try to map them to real-world object coordinates. OpenCV has a neat way to get such points: print out a black-and-white chessboard and use the cv::findChessboardCorners/cv::cornerSubPix functions to extract them. Around 10-15 image pairs of chessboards should do.
The matrices calculated by the calibration functions can be saved to disc so you don't have to repeat this process every time you start your application. You get some neat matrices here that allow you to create a rectification map (cv::stereoRectify/cv::initUndistortRectifyMap) that can later be applied to your images using cv::remap. You also get a neat matrix called Q, which is a disparity-to-depth matrix.
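A condensed sketch of that calibration-and-save step, loosely following the OpenCV stereo calibration sample; it assumes the chessboard corner lists have already been collected with cv::findChessboardCorners / cv::cornerSubPix, and the output file name is arbitrary:

    #include <opencv2/opencv.hpp>

    void calibrateAndSave(const std::vector<std::vector<cv::Point3f>>& objectPoints,
                          const std::vector<std::vector<cv::Point2f>>& imagePointsL,
                          const std::vector<std::vector<cv::Point2f>>& imagePointsR,
                          cv::Size imageSize)
    {
        cv::Mat K1, D1, K2, D2, R, T, E, F;
        K1 = cv::initCameraMatrix2D(objectPoints, imagePointsL, imageSize);
        K2 = cv::initCameraMatrix2D(objectPoints, imagePointsR, imageSize);

        // Extrinsics: rotation R and translation T of the right camera w.r.t. the left.
        cv::stereoCalibrate(objectPoints, imagePointsL, imagePointsR,
                            K1, D1, K2, D2, imageSize, R, T, E, F,
                            cv::CALIB_USE_INTRINSIC_GUESS);

        // Rectification transforms and the disparity-to-depth matrix Q.
        cv::Mat R1, R2, P1, P2, Q;
        cv::stereoRectify(K1, D1, K2, D2, imageSize, R, T, R1, R2, P1, P2, Q);

        // Save everything so the calibration does not have to be repeated.
        cv::FileStorage fs("stereo_calib.yml", cv::FileStorage::WRITE);
        fs << "K1" << K1 << "D1" << D1 << "K2" << K2 << "D2" << D2
           << "R" << R << "T" << T << "R1" << R1 << "R2" << R2
           << "P1" << P1 << "P2" << P2 << "Q" << Q;
    }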
The reason to rectify your images is that once the process is complete for a pair of images (assuming your calibration is correct), every pixel/object in one image can be found on the same row in the other image.
There are a few ways you can go from here, depending on what kind of features you are looking for in the image. One way is to use OpenCV's stereo correspondence functions, such as Stereo Block Matching or Semi-Global Block Matching. These will give you a disparity map for the entire image, which can be transformed to 3D points using the Q matrix (cv::reprojectImageTo3D).
The downside of this is that unless there is a lot of texture information in the image, OpenCV isn't very good at building a dense disparity map (you will get gaps where it couldn't find the correct disparity for a given pixel), so another approach is to find the points you want to match yourself. Say you find the feature/object at x=40, y=110 in the left image and at x=22 in the right image (since the images are rectified, they should have the same y value). The disparity is calculated as d = 40 - 22 = 18.
Construct a cv::Point3f(x,y,d), in our case (40,110,18). Find other interesting points the same way, then send all of the points to cv::perspectiveTransform (with the Q matrix as the transformation matrix, essentially this function is cv::reprojectImageTo3D but for sparse disparity maps) and the output will be points in an XYZ-coordinate system with the left camera at the center.
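A minimal sketch of that sparse reprojection, using the example point from the paragraph above (the calibration file name and key are placeholders):

    #include <opencv2/opencv.hpp>
    #include <iostream>

    int main()
    {
        // Q: the 4x4 disparity-to-depth matrix from cv::stereoRectify,
        // loaded from the file written during calibration.
        cv::Mat Q;
        cv::FileStorage fs("stereo_calib.yml", cv::FileStorage::READ);
        if (!fs.isOpened()) return 1;
        fs["Q"] >> Q;

        // Feature at x=40, y=110 in the left image, x=22 in the right image,
        // so disparity d = 40 - 22 = 18.
        std::vector<cv::Point3f> disparityPoints = { {40.f, 110.f, 18.f} };

        // perspectiveTransform with Q plays the role of reprojectImageTo3D
        // for a sparse set of (x, y, d) points.
        std::vector<cv::Point3f> xyzPoints;
        cv::perspectiveTransform(disparityPoints, xyzPoints, Q);

        std::cout << "3D point (left camera frame): " << xyzPoints[0] << std::endl;
        return 0;
    }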
I am still working on it, so I will not post the entire source code yet, but I will give you a conceptual solution.
You will need the following data as input (for both cameras):
camera position
camera point of interest (point at which camera is looking)
camera resolution (horizontal and vertical)
camera field of view angles (horizontal and vertical)
You can measure the last one yourself by placing the camera on a piece of paper, drawing two lines along the edges of its field of view, and measuring the angle between those lines.
Cameras do not have to be aligned in any way, you only need to be able to see your object in both cameras.
Now calculate a vector from each camera to your object. You have (X,Y) pixel coordinates of the object from each camera, and you need to calculate a vector (X,Y,Z). Note that in the simple case, where the object is seen right in the middle of the camera, the solution would simply be (camera.PointOfInterest - camera.Position).
Once you have both vectors pointing at your target, the lines defined by these vectors should cross at a single point in an ideal world. In the real world they will not, because of small measurement errors and the limited resolution of the cameras. So use the link below to calculate the shortest connecting segment between the two lines.
Distance between two lines
In that link: P0 is your first camera position, Q0 is your second camera position, and u and v are the vectors starting at the camera positions and pointing at your target.
You are not interested in the actual distance that page calculates; you need the segment Wc, and we can assume that the object sits in the middle of Wc. Once you have the position of your object in 3D space, you can compute whatever distance you like.
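Below is a rough sketch of that closest-point computation in the notation of the linked page (line 1 = P0 + s*u, line 2 = Q0 + t*v); the camera positions and ray directions in main() are made-up values purely for illustration:

    #include <cmath>
    #include <iostream>
    #include <opencv2/opencv.hpp>

    // Midpoint of the shortest segment between two 3D lines (the middle of Wc).
    cv::Point3d closestPointBetweenRays(cv::Point3d P0, cv::Vec3d u,
                                        cv::Point3d Q0, cv::Vec3d v)
    {
        cv::Vec3d w0 = cv::Vec3d(P0 - Q0);
        double a = u.dot(u), b = u.dot(v), c = v.dot(v);
        double d = u.dot(w0), e = v.dot(w0);
        double denom = a * c - b * b;          // ~0 means the rays are (nearly) parallel

        double s = (b * e - c * d) / denom;    // parameter on line 1
        double t = (a * e - b * d) / denom;    // parameter on line 2

        cv::Point3d onLine1 = P0 + cv::Point3d(s * u);
        cv::Point3d onLine2 = Q0 + cv::Point3d(t * v);
        return 0.5 * (onLine1 + onLine2);      // middle of the connecting segment Wc
    }

    int main()
    {
        // Hypothetical camera positions and viewing-ray directions towards the object.
        cv::Point3d cam1(0.0, 0.0, 0.0), cam2(0.5, 0.0, 0.0);
        cv::Vec3d ray1(0.1, 0.0, 1.0), ray2(-0.1, 0.0, 1.0);

        cv::Point3d object = closestPointBetweenRays(cam1, ray1, cam2, ray2);
        cv::Point3d dvec = object - cam1;
        std::cout << "object at " << object
                  << ", distance from cam1 = " << std::sqrt(dvec.dot(dvec)) << std::endl;
        return 0;
    }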
I will post the entire source code soon.
I have source code that detects a human face and returns not only the depth but also the real-world coordinates, with the left camera (or the right camera, I can't remember) as the origin. It is adapted from the source code of "Learning OpenCV", plus some websites I referred to in order to get it working. The result is generally quite accurate.
