Finding camera position without calibration - opencv

I have to find the camera position using several images of a football field. Using these images I can find the homography that transforms the points in the 3D model (in meters) into the points in the image (pixels). Each picture can have a different focal length, hence the camera intrinsics can change. I'm using OpenCV in C++.
I've found solutions for similar problems, but either the camera is already calibrated with a fixed focal length (in which case you only need one image), or the camera is not calibrated but its intrinsics are assumed not to change (in which case you need at least 3 images).
Is there any way to find the camera position in the 3D model without knowing anything besides the homographies (with different focal lengths)?

Short answer: No.
If you don't have any additional information about the cameras, there is no way to uniquely position the cameras. There would be multiple combinations of camera attributes & positions that would produce the same image.
You may be able to constrain the camera position along a certain 3D line, and reduce the problem to 1 degree of freedom, if that helps.
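For contrast, here is a minimal sketch of the calibrated case mentioned in the question, where one homography is enough. It assumes the intrinsics are known and simple (focal length f in pixels, principal point (cx, cy), square pixels, zero skew), that H maps model-plane points (X, Y, 1) on Z = 0 to pixels, and that the model origin lies in front of the camera (used to fix the sign). The function name and types are hypothetical, not OpenCV API.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // row-major 3x3

static double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Hypothetical helper: camera centre in model coordinates, given a
// plane-to-image homography H and ASSUMED-known intrinsics f, cx, cy.
Vec3 cameraCenterFromHomography(const Mat3& H, double f, double cx, double cy) {
    assert(f > 0.0);
    // M = K^-1 * H with K = [f 0 cx; 0 f cy; 0 0 1]; M = scale * [r1 r2 t].
    Mat3 M{};
    for (std::size_t c = 0; c < 3; ++c) {
        M[0][c] = (H[0][c] - cx * H[2][c]) / f;
        M[1][c] = (H[1][c] - cy * H[2][c]) / f;
        M[2][c] = H[2][c];
    }
    // r1 must have unit length, which fixes the magnitude of the scale.
    double scale = std::sqrt(M[0][0] * M[0][0] + M[1][0] * M[1][0] + M[2][0] * M[2][0]);
    assert(scale > 1e-12);
    // Assumption: the model origin is in front of the camera, so t_z > 0.
    if (M[2][2] < 0.0) scale = -scale;

    Vec3 r1{}, r2{}, t{};
    for (std::size_t i = 0; i < 3; ++i) {
        r1[i] = M[i][0] / scale;
        r2[i] = M[i][1] / scale;
        t[i]  = M[i][2] / scale;
    }
    // Third rotation column is the cross product of the first two.
    const Vec3 r3{r1[1] * r2[2] - r1[2] * r2[1],
                  r1[2] * r2[0] - r1[0] * r2[2],
                  r1[0] * r2[1] - r1[1] * r2[0]};
    // Camera centre C = -R^T t.
    return {-dot(r1, t), -dot(r2, t), -dot(r3, t)};
}
```

With a noisy homography r1 and r2 will not be exactly orthonormal, so a real implementation would re-orthogonalise the rotation. When f is unknown and differs per image, the `M = K^-1 * H` step cannot be carried out, which is where the ambiguity described above comes from.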

Related

Calibration of stationary video camera

I have a stationary video camera in a room and several videos from it. I need to transform the image coordinates into world coordinates.
What I know:
1. All the measurements of the room.
2. 16 image coordinates and their respective world coordinates.
The problem I encounter:
At first I thought I just needed to create a geometric transformation (according to http://xenia.media.mit.edu/~cwren/interpolator/), but the edges of the room are distorted in the image, and I can't calibrate the camera because I don't have access to the room or the camera.
Is there any way I can overcome these difficulties and measure distances in the room with some accuracy?
Thanks
You can calibrate the distortion of the camera by extracting first the edges of your room and then finding the best set of distortion parameters (that will minimize edge distortion).
There are few works that implement this approach though:
you can find a skeleton of distortion estimation procedure in R. Szeliski's book, but without an implementation;
alternatively, you can find a method + implementation (+ an online demo where you can upload your images) on IPOL.
Regarding the perspective distortion, after removing the lens distortion just proceed with the link that you have found by applying this method to the image of the four corners of the room floor.
This will give you the mapping between an image pixel and a ground pixel (and thus the object world coordinate, assuming you only want the X-Y coordinates). If you need the height measurement, then you need to find an object with a known height in your images to calibrate it too.
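Once you have the 3x3 image-to-floor homography, applying it to a pixel is just a matrix product followed by a perspective divide. A minimal sketch, assuming the pixel has already been corrected for lens distortion and that H is stored row-major (the names `imageToGround` and `Point2` are made up for illustration):

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;  // row-major 3x3

struct Point2 { double x, y; };

// Map an undistorted image pixel to floor (X, Y) coordinates using H.
Point2 imageToGround(const Mat3& H, Point2 pixel) {
    const double X = H[0][0] * pixel.x + H[0][1] * pixel.y + H[0][2];
    const double Y = H[1][0] * pixel.x + H[1][1] * pixel.y + H[1][2];
    const double w = H[2][0] * pixel.x + H[2][1] * pixel.y + H[2][2];
    // w == 0 means the pixel lies on the floor's horizon line: no finite point.
    assert(std::abs(w) > 1e-12);
    return {X / w, Y / w};  // perspective divide
}
```

The distance between two floor points is then the ordinary Euclidean distance between their mapped (X, Y) coordinates, in whatever units the floor corners were given in.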

Why do we need to move the calibration object for pinhole camera calibration?

Is there any particular reason why we need multiple poses (e.g. varying z or rotation) to obtain the focal length and principal point for the camera matrix? In other words, is it sufficient to calibrate a pinhole camera with a single pose? i.e. by keeping the location of the calibration object (let's say a standard checkerboard) constant?
I assume you are asking in the context of OpenCV-like camera calibration using images of a planar target. The reference for the algorithm used by OpenCV is Z. Zhang's now classic paper. The discussion in the top half of page 6 shows that n >= 3 images are necessary for calibrating all 5 parameters of a pinhole camera matrix, since each image of the plane contributes only two constraints on the intrinsics. Imposing constraints on the parameters reduces the number of needed images to a theoretical minimum of one.
In practice you need more for various reasons, among them:
The need to have enough measurements to overcome "noise" and "random" corner detection errors, while using a practical target with well-separated corners.
The difference between measuring data and observing (constraining) model parameters.
Practical limitations of physical lenses, e.g. depth of field.
As an example for the second point, the ideal target pose for calibrating the nonlinear lens distortion (barrel, pincushion, tangential, etc.) is frontal-facing, covering the whole field of view, because it produces a large number of well-separated and aligned corners over the image, all with approximately the same degree of blur. However, this is exactly the worst pose you can use in order to estimate the field of view / focal length, as for that purpose you need to observe significant perspective foreshortening.
Likewise, it is possible to show that the location of the principal point is well constrained by a set of images showing the vanishing points of multiple pencils of parallel lines. This is important because that location is inherently confounded with the component of the relative camera-target motion that is parallel to the image plane. Thus the vanishing points help "guide" the optimizer's solution toward the correct one, in the common case where the target does translate w.r.t. the camera.

Correspondence between a set of 3D model points and their image projections

I have a set of 3-d points and some images with the projections of these points. I also have the focal length of the camera and the principal point of the images with the projections (resulting from previously done camera calibration).
Is there any way to, given these parameters, find the automatic correspondence between the 3-d points and the image projections? I've looked through some OpenCV documentation but I didn't find anything suitable until now. I'm looking for a method that does the automatic labelling of the projections and thus the correspondence between them and the 3-d points.
The question is not very clear, but I think you mean to say that you have the intrinsic calibration of the camera, but not its location and attitude with respect to the scene (the "extrinsic" part of the calibration).
This problem does not have a unique solution for a general 3d point cloud if all you have is one image: just notice that the image does not change if you move the 3d points anywhere along the rays projecting them into the camera.
If you have one or more images, you know everything about the 3D cloud of points (e.g. the points belong to an object of known shape and size, and are at known locations upon it), and you have matched them to their images, then it is a standard "camera resectioning" problem: you just solve for the camera extrinsic parameters that make the 3D points project onto their images.
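To make "project onto their images" concrete, here is a rough sketch of the quantity a resectioning solver drives toward zero: the reprojection error for a candidate pose (R, t) under a simple pinhole model. The struct and function names are illustrative only, and lens distortion is ignored. In OpenCV the actual solve is what cv::solvePnP is for.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // row-major 3x3

struct Pixel { double u, v; };

// Illustrative pinhole camera: x_cam = R * X_world + t, then perspective divide.
struct Camera {
    Mat3 R;             // world-to-camera rotation (extrinsic)
    Vec3 t;             // world-to-camera translation (extrinsic)
    double f, cx, cy;   // focal length and principal point in pixels (intrinsic)
};

Pixel project(const Camera& cam, const Vec3& X) {
    Vec3 p{};
    for (std::size_t i = 0; i < 3; ++i) {
        p[i] = cam.R[i][0] * X[0] + cam.R[i][1] * X[1] + cam.R[i][2] * X[2] + cam.t[i];
    }
    assert(p[2] > 0.0);  // the point must be in front of the camera
    return {cam.f * p[0] / p[2] + cam.cx, cam.f * p[1] / p[2] + cam.cy};
}

// RMS distance in pixels between projected 3D points and their matched observations.
double rmsReprojectionError(const Camera& cam,
                            const std::vector<Vec3>& points,
                            const std::vector<Pixel>& observed) {
    assert(points.size() == observed.size() && !points.empty());
    double sum = 0.0;
    for (std::size_t i = 0; i < points.size(); ++i) {
        const Pixel p = project(cam, points[i]);
        const double du = p.u - observed[i].u;
        const double dv = p.v - observed[i].v;
        sum += du * du + dv * dv;
    }
    return std::sqrt(sum / static_cast<double>(points.size()));
}
```

Note that the error can only be evaluated once each 3D point has been matched to its image point, which is exactly the labelling the question asks about.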
If you have multiple images and you know that the scene is static while the camera is moving, and you can match "enough" 3d points to their images in each camera position, you can solve for the camera poses up to scale. You may want to start from David Nister's and/or Henrik Stewenius's papers on solvers for calibrated cameras, and then look into "bundle adjustment".
If you really want to learn about this (vast) subject, Zisserman and Hartley's book is as good as any. For code, look into libmv, vxl, and the ceres bundle adjuster.

two images with camera position and angle to 3d data?

Suppose I've got two images taken by the same camera. I know the 3d position of the camera and the 3d angle of the camera when each picture was taken. I want to extract some 3d data from the images on the portion of them that overlaps. It seems that OpenCV could help me solve this problem, but I can't seem to find where my camera position and angle would be used in their method stack. Help? Is there some other C library that would be more helpful? I don't even know what keywords to search for on the web. What's the technical term for overlapping image content?
You need to learn a little more about camera geometry, and stereo rig geometry. Unless your camera was mounted on a special rig, it's rather doubtful that its pose at each image can be specified with just an angle and a point. Rather, you'd need three angles (e.g. roll, pitch, yaw). Plus, if you want your reconstruction to be metrically accurate, you need to accurately calibrate the focal length of the camera (at a minimum).
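As a small illustration of why three angles are needed, here is a sketch that builds a full 3x3 rotation matrix from roll, pitch and yaw. The convention is an assumption (R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians); your rig or data source may use a different order or sign, so check before relying on it.

```cpp
#include <array>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;  // row-major 3x3

// Rotation matrix for an ASSUMED Z-Y-X convention: R = Rz(yaw) * Ry(pitch) * Rx(roll).
Mat3 rotationFromRollPitchYaw(double roll, double pitch, double yaw) {
    const double cr = std::cos(roll),  sr = std::sin(roll);
    const double cp = std::cos(pitch), sp = std::sin(pitch);
    const double cy = std::cos(yaw),   sy = std::sin(yaw);
    Mat3 R{};
    R[0] = {cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr};
    R[1] = {sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr};
    R[2] = {-sp,     cp * sr,                cp * cr};
    return R;
}
```

Together with the 3D camera position, this matrix gives the full six-degree-of-freedom pose that stereo reconstruction needs for each image.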

OpenCV + photogrammetry

I have a stereo pair:
photo 1: http://savepic.org/1671682.jpg
photo 2: http://savepic.org/1667586.jpg
There is a coordinate system in each image. How can I find the coordinates of point A in this system using the OpenCV library? It would be nice to see sample code.
I've looked for it at opencv.willowgarage.com/documentation/cpp/camera_calibration_and_3d_reconstruction.html but haven't found (or haven't understood :) )
Your 'stereo' images are fine. What you have already done is solve the correspondence problem: in both images you have indicated point 'A'. This means that you know which pixels correspond to each other for point 'A'.
What you want to do is triangulate where point 'A' is. You can only do this by first calibrating your camera. This is already available in OpenCV.
http://docs.opencv.org/doc/tutorials/calib3d/camera_calibration/camera_calibration.html
http://docs.opencv.org/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html
This gives you the exact vector/ray of light for each pixel, and the optical center of your cameras through which the ray passes. Moreover, you need stereo calibration. This establishes the orientation and position of each camera with respect to the other.
From that point on, your triangulation is simple, knowing the pixel location in both images of point 'A'. You have
Location and orientation of camera 1 and camera 2
Optical ray vector (pixel location) from the cameras to label 'A'.
So you have 2 locations in space, and 2 rays from these locations. The intersection of these rays is your 3D answer.
Note that in practice the rays will never exactly intersect (2 lines in 3D rarely do), so you need to approximate. Use the OpenCV function triangulatePoints(), passing in the results of the stereo calibration and the pixel coordinates of label 'A'.
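To illustrate the "rays rarely intersect" point, here is a minimal sketch of one common approximation: take the midpoint of the shortest segment joining the two rays. This is a simple geometric alternative, not necessarily what triangulatePoints() does internally. It assumes you already have each camera centre and a ray direction toward 'A' in a common world frame; the function name is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Vec3 = std::array<double, 3>;

static double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Midpoint of the closest points on ray 1 (c1 + s*d1) and ray 2 (c2 + t*d2).
Vec3 triangulateMidpoint(const Vec3& c1, const Vec3& d1,
                         const Vec3& c2, const Vec3& d2) {
    const Vec3 w0{c1[0] - c2[0], c1[1] - c2[1], c1[2] - c2[2]};
    const double a = dot(d1, d1), b = dot(d1, d2), c = dot(d2, d2);
    const double d = dot(d1, w0), e = dot(d2, w0);
    const double denom = a * c - b * b;
    assert(denom > 1e-12);  // zero when the rays are parallel: no unique answer
    const double s = (b * e - c * d) / denom;  // parameter along ray 1
    const double t = (a * e - b * d) / denom;  // parameter along ray 2
    Vec3 mid{};
    for (std::size_t i = 0; i < 3; ++i) {
        mid[i] = 0.5 * ((c1[i] + s * d1[i]) + (c2[i] + t * d2[i]));
    }
    return mid;
}
```

The length of the segment between the two closest points is a handy sanity check: if it is large compared to the scene, the calibration or the correspondence is probably off.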
First of all, this is not truly a stereo pair. A good stereo pair needs 60%-80% overlap and usually small rotation differences between the images. Even if this pair had the necessary base to be a good stereo pair, the extreme kappa rotation would make the resulting epipolar image useless.
Secondly, among other things, you should take a look at camera calibration and the collinearity equations, both supported by OpenCV:
http://en.wikipedia.org/wiki/Camera_resectioning
http://en.wikipedia.org/wiki/Collinearity_equation
You need to understand the maths.
If the page isn't enough, then you should look at the OpenCV book, which devotes a couple of chapters to this. Beyond that, there are a lot of textbooks that cover it in more detail.

Resources