I have a stationary video camera in a room and several videos from it, and I need to transform image coordinates into world coordinates.
What I know:
1. All the measurements of the room.
2. 16 image coordinates and their corresponding world coordinates.
The problem I encounter:
At first I thought I just needed to create a geometric transformation (following http://xenia.media.mit.edu/~cwren/interpolator/), but the edges of the room are distorted in the image, and I can't calibrate the camera because I can't get access to the room or the camera.
Is there any way I can overcome those difficulties and measure distances in the room with some accuracy?
Thanks
You can estimate the camera's lens distortion by first extracting the edges of your room and then finding the set of distortion parameters that best straightens them (i.e. minimizes edge distortion).
There are a few works that implement this approach, though:
you can find a skeleton of the distortion estimation procedure in R. Szeliski's book, but without an implementation;
alternatively, you can find a method + implementation (+ an online demo where you can upload your images) on IPOL.
Regarding the perspective distortion: after removing the lens distortion, just proceed with the link you found, applying that method to the image positions of the four corners of the room floor.
This will give you the mapping between an image pixel and a ground position (and thus the object's world coordinates, assuming you only want the X-Y coordinates). If you need height measurements, you will also need an object of known height in your images to calibrate that.
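Here is a minimal sketch of that floor-plane mapping in Python with OpenCV, assuming the lens distortion has already been removed; the pixel coordinates, room dimensions, and test point below are hypothetical placeholders for your own measured correspondences:

```python
import cv2
import numpy as np

# Image coordinates (pixels) of the four floor corners, after undistortion.
# These values are placeholders; use your own measured points.
img_pts = np.array([[102, 540], [890, 528], [760, 310], [215, 318]], dtype=np.float32)

# The same corners in world coordinates on the floor plane (metres, Z = 0),
# here for a hypothetical 5 m x 4 m room.
world_pts = np.array([[0, 0], [5.0, 0], [5.0, 4.0], [0, 4.0]], dtype=np.float32)

# Homography mapping the image plane to the floor plane.
H, _ = cv2.findHomography(img_pts, world_pts)

def image_to_floor(u, v):
    """Map an undistorted image pixel (u, v) to floor coordinates (X, Y)."""
    p = H @ np.array([u, v, 1.0])
    return p[0] / p[2], p[1] / p[2]

print(image_to_floor(450, 420))  # approximate X-Y position on the floor
```

Since you have 16 correspondences, you can also pass all of the floor-level ones to findHomography at once and let it do a least-squares (or RANSAC) fit instead of using only the four corners.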
I have two images of the same object from different views. I want to perform a camera calibration, but from what I have read so far I need 3D world points to get the camera matrix.
I am stuck at this step; can someone explain it to me?
Popular camera calibration methods use 2D-3D point correspondences to determine the projective properties (intrinsic parameters) and the pose of a camera (extrinsic parameters). The simplest approach is the Direct Linear Transformation (DLT).
You might have seen that planar chessboards are often used for camera calibration. The 3D coordinates of their corners can be chosen freely by the user; many people place the chessboard in the x-y plane, i.e. [x, y, 0]'. However, the 3D coordinates need to be consistent.
Coming back to your object: define your own 3D coordinate system on the object and find at least six points whose 3D positions you can determine easily. Once you have that, find their corresponding 2D (pixel) positions in your two images.
There are complete examples in OpenCV. Maybe you will get a better picture by reading the code.
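For instance, a minimal chessboard calibration sketch in Python, assuming a 9x6 board with 25 mm squares and image files matching "calib_*.png" (both assumptions you would replace with your own setup):

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)   # inner corners per row and column (assumed board)
square = 25.0      # square size in mm; sets the scale of the 3D frame

# Consistent 3D coordinates of the corners, chosen in the z = 0 plane.
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

obj_points, img_points = [], []
image_size = None
for fname in glob.glob("calib_*.png"):
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)
        image_size = gray.shape[::-1]

# Solve for intrinsics (camera matrix, distortion) and per-image extrinsics.
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, image_size, None, None)
print("camera matrix:\n", K)
```

For your own object, the same cv2.calibrateCamera call works if you replace objp with the 3D coordinates you defined on the object and corners with the matching pixel positions.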
I am tracking a moving vehicle with a stereo camera system. In both images I use background segmentation to get only the moving parts in the pictures, then put a rectangle around the biggest object.
Now I want to get the 3D coordinates of the center of the rectangle. The identified centers in the two 2D pictures are nearly corresponding points (I know, not exactly). I did a stereo calibration with MATLAB, so I have the intrinsic parameters of both cameras and the extrinsic parameters of the stereo system.
OpenCV doesn't provide any function for doing this as far as I know, and to be honest reading Zisserman didn't really help me; maybe I am just blind to the obvious.
This should work:
1. For both cameras, compute a ray from the camera origin through the rectangle's center.
2. Convert the rays to world coordinates.
3. Compute the intersection of the two rays (or the closest point between them, since in practice they will not intersect exactly).
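A small sketch of these steps, assuming K is a camera's 3x3 intrinsic matrix and R, C are its world-to-camera rotation and camera centre from your stereo calibration (names chosen here for illustration):

```python
import numpy as np

def pixel_to_ray(K, R, C, u, v):
    """Ray (origin, unit direction) in world coordinates through pixel (u, v)."""
    d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])   # direction in camera frame
    d_world = R.T @ d_cam                              # rotate into the world frame
    return C, d_world / np.linalg.norm(d_world)

def closest_point_between_rays(o1, d1, o2, d2):
    """Midpoint of the shortest segment between two rays (unit directions)."""
    # Solve for t1, t2 minimising |(o1 + t1*d1) - (o2 + t2*d2)|.
    A = np.array([[d1 @ d1, -d1 @ d2],
                  [d1 @ d2, -d2 @ d2]])
    b = np.array([(o2 - o1) @ d1, (o2 - o1) @ d2])
    t1, t2 = np.linalg.solve(A, b)
    return 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))
```

Alternatively, since the MATLAB stereo calibration gives you both projection matrices, cv2.triangulatePoints can do the same job directly on the two 2D centres.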
I'm taking camera images of white paper with a black square of known size (e.g. 10 cm). The images are taken at different distances from the paper plane and at different camera angles.
Now I need to deduce from those images the camera rotation, the camera translation, and the distance to the paper plane, as well as the distance to the square's corners.
I'm quite new to image processing, so maybe somebody can direct me to keywords, algorithms, or basic math to look for, or even OpenCV functions to investigate. There will always be primitive objects like squares on the paper, so I don't need an algorithm that works on any arbitrary image, but I definitely need a fast one.
To calculate camera rotation and translation you need to follow several steps that are always the same in this kind of problem:
Run a feature detector on a sample image (e.g. FAST).
Run the detector on all images you want to process; these could be frames captured from video.
Generate descriptors for the detected points (e.g. SIFT).
Match the descriptors with a matcher (e.g. FlannBasedMatcher).
Find the homography from the matched pairs (findHomography()).
Find the camera pose from the homography.
You have some links to the methods in this tutorial.
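A rough sketch of that pipeline in Python, assuming a reference image of the printed pattern ("pattern.png"), a frame to process ("frame.png"), and a camera matrix K from a prior calibration (all placeholders to replace with your own data):

```python
import cv2
import numpy as np

K = np.array([[800.0, 0, 320],    # assumed intrinsics from a prior calibration
              [0, 800.0, 240],
              [0, 0, 1]])

ref = cv2.imread("pattern.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute descriptors (SIFT does both here).
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(ref, None)
kp2, des2 = sift.detectAndCompute(frame, None)

# Match descriptors and keep the good matches (Lowe's ratio test).
matcher = cv2.FlannBasedMatcher()
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.7 * n.distance]

src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

# Homography between the reference pattern and the frame.
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Candidate rotations and translations (up to scale) of the camera pose.
n, rotations, translations, normals = cv2.decomposeHomographyMat(H, K)
```

Since your pattern is a square of known size, you could also skip the feature pipeline and feed the four detected corners and their known 3D positions to cv2.solvePnP, which is faster and gives the pose with metric scale.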
I have a set of 3-d points and some images with the projections of these points. I also have the focal length of the camera and the principal point of the images with the projections (resulting from previously done camera calibration).
Given these parameters, is there any way to find the correspondence between the 3-d points and the image projections automatically? I've looked through some OpenCV documentation but haven't found anything suitable so far. I'm looking for a method that automatically labels the projections and thus establishes the correspondence between them and the 3-d points.
The question is not very clear, but I think you mean to say that you have the intrinsic calibration of the camera, but not its location and attitude with respect to the scene (the "extrinsic" part of the calibration).
This problem does not have a unique solution for a general 3d point cloud if all you have is one image: just notice that the image does not change if you move the 3d points anywhere along the rays projecting them into the camera.
If you have one or more images, you know everything about the 3D cloud of points (e.g. the points belong to an object of known shape and size and lie at known locations on it), and you have matched them to their images, then it is a standard "camera resectioning" problem: you just solve for the camera extrinsic parameters that make the 3D points project onto their images.
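For that resectioning case, OpenCV's solvePnP does exactly this. A minimal sketch, assuming the matched 3-d points and their projections are already in object_points / image_points and the intrinsics from your earlier calibration are in K (the arrays below are placeholders):

```python
import cv2
import numpy as np

# Placeholder data: replace with your matched 3D points and their projections.
object_points = np.random.rand(10, 3).astype(np.float32)
image_points = np.random.rand(10, 2).astype(np.float32) * 500
K = np.array([[1000.0, 0, 640], [0, 1000.0, 360], [0, 0, 1]])
dist = np.zeros(5)   # assume distortion was already handled

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist)
R, _ = cv2.Rodrigues(rvec)       # world-to-camera rotation
camera_centre = -R.T @ tvec      # camera position in world coordinates
```

Note that this assumes the 2D-3D matching is already done; the automatic matching itself is the harder part of your question.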
If you have multiple images and you know that the scene is static while the camera is moving, and you can match "enough" 3d points to their images in each camera position, you can solve for the camera poses up to scale. You may want to start from David Nister's and/or Henrik Stewenius's papers on solvers for calibrated cameras, and then look into "bundle adjustment".
If you really want to learn about this (vast) subject, Zisserman and Hartley's book is as good as any. For code, look into libmv, vxl, and the ceres bundle adjuster.
Suppose I've got two images taken by the same camera. I know the 3d position of the camera and the 3d angle of the camera when each picture was taken. I want to extract some 3d data from the images on the portion of them that overlaps. It seems that OpenCV could help me solve this problem, but I can't seem to find where my camera position and angle would be used in their method stack. Help? Is there some other C library that would be more helpful? I don't even know what keywords to search for on the web. What's the technical term for overlapping image content?
You need to learn a little more about camera geometry and stereo rig geometry. Unless your camera was mounted on a special rig, it's rather doubtful that its pose for each image can be specified with just an angle and a point; rather, you'd need three angles (e.g. roll, pitch, yaw) plus a position. Also, if you want your reconstruction to be metrically accurate, you need to calibrate the camera's focal length accurately (at a minimum).
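If your pose really is given by three angles and a position, you can fold it together with the intrinsics into a projection matrix and then triangulate matched points in the overlapping region. A small sketch, where the angle convention (Z-Y-X) and the intrinsic values are assumptions you would replace with your own:

```python
import numpy as np

def rotation_from_rpy(roll, pitch, yaw):
    """Rotation matrix from roll/pitch/yaw in radians (Z-Y-X convention)."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    return Rz @ Ry @ Rx

K = np.array([[1000.0, 0, 640], [0, 1000.0, 360], [0, 0, 1]])  # assumed intrinsics
R = rotation_from_rpy(0.0, 0.1, 0.5)     # camera attitude (world -> camera)
C = np.array([1.0, 2.0, 0.5])            # camera position in world coordinates
t = -R @ C
P = K @ np.hstack([R, t.reshape(3, 1)])  # 3x4 projection matrix for one image
```

With one such P per image, matched points in the overlap can be triangulated (e.g. with cv2.triangulatePoints); "stereo correspondence" and "two-view triangulation" are useful search terms for the matching and reconstruction steps.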