My setup has checkerboard charts with known world coordinates present in each image that I use to stitch images together (in a 2D plane) and to find my P-matrix. However, I am stuck on finding a general approach into combining all my images into a spherical image.
Known:
Ground truth correspondence points in each image
camera calibration parameters (camera matrix, distortion coefficients)
homography between images
world-image plane matrix: P = K[R | t] for each image. However I think this matrix's estimation isn't that great.
real world coordinates of ground truthed points
camera has almost only rotation, minimal translation
I know openGL well enough to do the spherical/texture wrapping once I can stitch the images into a cubemap format
Unknown:
Spherical image
image cubemap
Related
Consider the following problem:
I have the original image A saved as "A.png".
Moreover, I also have camera video feed that shows (possibly with some perspective transformation) an image of A, denoted Va, with some level of radial distortion.
I also have an homography from A to Va and its inverse.
How could I undistort Va? Note that I do not want to undo the perspective transformation, just remove the radial distortion from Va.
Example:
I have a fully mapped and undistorted reference image (including real world size)
an image from a video frame (left image)
and an homography and its inverse between those two
In our use case, the left image would have radial distortion but we would like to remove it without applying a simple backprojection (this would create artifacts)
Undistortion is the process to transform a distorted image (e.g. image with a fisheye pattern) to undistorted version.
In this case your video frames do not suffer from distortion. And if you have already determined homography matrix, you need to apply perspective transformation.
You might need to invert the homography matrix in case you need to invert the transformation direction.
I am trying to understand mapping points between two images of same scene except the camera positions are different. say like this apologies for the rough sketch and the hand-writing. Sample image taken from cam1 and Sample image taken from cam2 . Trying to map between these two images. since the two cameras used are same(logitech camera). I assume camera calibration isn't required. So with the help of SIFT descriptors and feature matching, using the good matches from the images as inputs to Homography with RANSAC. I get 3*3 matrix. To verify the view mapping. I select few objects(say bins in the image) in cam1 image and try to map the same object in cam2 image using 3 * 3 matrix by using warp_perspective, but the outputs aren't good. say something like this had selected top left and bottom right of the objects in cam1 image(i.e. bins) and trying to draw a bounding box for the desired object in cam2 image.
But as visible in the view map output image the bounding boxes aren't proper to the bins.
Wanted to understand, where am i going wrong. Is it the camera positions affecting, and this shouldn't be used for homography or have to use multiple homographies or have to get to know the translation between the camera positions. very confused. Thank you.
Homography transforms plane into a plane. It can only be used if all of the matches lay on a plane in real world (e.g. on the planar wall) or the feature points are located far from both cameras so the transformation between the cameras might be expressed as pure rotation. See this link for further explanation.
In your case the objects are located at different depths so you need to perform stereo calibration of cameras and then compute the depth map to be able to map pixels from one camera into another.
I am attempting camera calibration from a single RGB image (panorama) given 3D pointcloud
The methods that I have considered all require an intrinsic properties matrix (which I have no access to)
The intrinsic properties matrix can be estimated using the Bouguet’s camera calibration Toolbox, but as I have said, I have a single image only and a single point cloud for that image.
So, knowing 2D image coordinates, extrinsic properties, and 3D world coordinates, how can the intrinsic properties be estimated?
It would seem that the initCameraMatrix2D function from the OpenCV (https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html) works in the same way as the Bouguet’s camera calibration Toolbox and requires multiple images of the same object
I am looking into the Direct linear transformation DLT and Levenberg–Marquardt algorithm with implementations https://drive.google.com/file/d/1gDW9zRmd0jF_7tHPqM0RgChBWz-dwPe1
but it would seem that both use the pinhole camera model and therefore find linear transformation between 3D and 2D points
I can't find my half year old source code, but from top of my head
cx, cy is optical centre which is width/2, height/2 in pixels
fx=fy is focal length in pixels (distance from camera to image plane or axis of rotation)
If you know that image distance from camera to is for example 30cm and it captures image that has 16x10cm and 1920x1200 pixels, size of pixel is 100mm/1200=1/12mm and camera distance (fx,fy) would be 300mm*12px/1mm=3600px and image centre is cx=1920/2=960, cy=1200/2=600. I assume that pixels are square and camera sensor is centered at optical axis.
You can get focal lenght from image size in pixels and measured angle of view.
I have used openCV to calculate the homography relating to views of the same plane by using features and matching them. Is there any way to recover the plane itsself or the plane normal from this homography? (I am looking for an equation where H is the input and the normal n is the output.)
If you have the calibration of the cameras, you can extract the normal of the plane, but not the distance to the plane (i.e. the transformation that you obtain is up to scale), as Wikipedia explains. I don't know any implementation to do it, but here you are a couple of papers that deal with that problem (I warn you it is not straightforward): Faugeras & Lustman 1988, Vargas & Malis 2005.
You can recover the real translation of the transformation (i.e. the distance to the plane) if you have at least a real distance between two points on the plane. If that is the case, the easiest way to go with OpenCV is to first calculate the homography, then obtain four points on the plane with their 2D coordinates and the real 3D ones (you should be able to obtain them if you have a real measurement on the plane), and using PnP finally. PnP will give you a real transformation.
Rectifying an image is defined as making epipolar lines horizontal and lying in the same row in both images. From your description I get that you simply want to warp the plane such that it is parallel to the camera sensor or the image plane. This has nothing to do with rectification - I’d rather call it an obtaining a bird’s-eye view or a top view.
I see the source of confusion though. Rectification of images usually involves multiplication of each image with a homography matrix. In your case though each point in sensor plane b:
Xb = Hab * Xa = (Hb * Ha^-1) * Xa, where Ha is homography from the plane in the world to the sensor a; Ha and intrinsic camera matrix will give you a plane orientation but I don’t see an easy way to decompose Hab into Ha and Hb.
A classic (and hard) way is to find a Fundamental matrix, restore the Essential matrix from it, decompose the Essential matrix into camera rotation and translation (up to scale), rectify both images, perform a dense stereo, then fit a plane equation into 3d points you reconstruct.
If you interested in the ground plane and you operate an embedded device though, you don’t even need two frames - a top view can be easily recovered from a single photo, camera elevation from the ground (H) and a gyroscope (or orientation vector) readings. A simple diagram below explains the process in 2D case: first picture shows how to restore Z (depth) coordinate to every point on the ground plane; the second picture shows a plot of the top view with vertical axis being z and horizontal axis x = (img.col-w/2)*Z/focal; Here is img.col is image column, w - image width, and focal is camera focal length. Note that a camera frustum looks like a trapezoid in a birds eye view.
I am doing stereo calibration of two cameras (let's name them L and R) with opencv. I use 20 pairs of checkerboard images and compute the transformation of R with respect to L. What I want to do is use a new pair of images, compute the 2d checkerboard corners in image L, transform those points according to my calibration and draw the corresponding transformed points on image R with the hope that they will match the corners of the checkerboard in that image.
I tried the naive way of transforming the 2d points from [x,y] to [x,y,1], multiply by the 3x3 rotation matrix, add the rotation vector and then divide by z, but the result is wrong, so I guess it's not that simple (?)
Edit (to clarify some things):
The reason I want to do this is basically because I want to validate the stereo calibration on a new pair of images. So, I don't actually want to get a new 2d transformation between the two images, I want to check if the 3d transformation I have found is correct.
This is my setup:
I have the rotation and translation relating the two cameras (E), but I don't have rotations and translations of the object in relation to each camera (E_R, E_L).
Ideally what I would like to do:
Choose the 2d corners in image from camera L (in pixels e.g. [100,200] etc).
Do some kind of transformation on the 2d points based on matrix E that I have found.
Get the corresponding 2d points in image from camera R, draw them, and hopefully they match the actual corners!
The more I think about it though, the more I am convinced that this is wrong/can't be done.
What I am probably trying now:
Using the intrinsic parameters of the cameras (let's say I_R and I_L), solve 2 least squares systems to find E_R and E_L
Choose 2d corners in image from camera L.
Project those corners to their corresponding 3d points (3d_points_L).
Do: 3d_points_R = (E_L).inverse * E * E_R * 3d_points_L
Get the 2d_points_R from 3d_points_R and draw them.
I will update when I have something new
It is actually easy to do that but what you're making several mistakes. Remember after stereo calibration R and L relate the position and orientation of the second camera to the first camera in the first camera's 3D coordinate system. And also remember to find the 3D position of a point by a pair of cameras you need to triangulate the position. By setting the z component to 1 you're making two mistakes. First, most likely you have used the common OpenCV stereo calibration code and have given the distance between the corners of the checker board in cm. Hence, z=1 means 1 cm away from the center of camera, that's super close to the camera. Second, by setting the same z for all the points you are saying the checker board is perpendicular to the principal axis (aka optical axis, or principal ray), while most likely in your image that's not the case. So you're transforming some virtual 3D points first to the second camera's coordinate system and then projecting them onto the image plane.
If you want to transform just planar points then you can find the homography between the two cameras (OpenCV has the function) and use that.