I start with two pan/tilt/zoom cameras oriented so they can both see an ArUco marker. These cameras have been calibrated to determine the camera matrix, and the size of the marker is known. I can determine the rotation and translation vectors from the marker to each camera, and the cameras' current pan, tilt, and zoom positions are known.
If I move the marker and turn one camera to follow it, I can note the new pan/tilt/zoom position and determine the new rotation and translation vectors. Now I want to turn the second camera to face the marker. How can I determine the new pan and tilt settings required?
I think I understand how to build a combined transformation matrix from the two sets of rotation and translation vectors, but I don't know how to account for changing pan/tilt values.
I put some code in a Google Colab to better illustrate what I'm trying to do.
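One way to account for the changing pan/tilt values is to express every marker pose in a fixed "base" frame per camera (the camera frame at pan = tilt = 0) instead of in the moving camera frame. The sketch below is a minimal illustration under assumptions that are not in the question: pan rotates about the camera's y axis, tilt about its x axis, both axes pass through the optical centre, and the rotation matrices come from something like `cv2.Rodrigues(rvec)[0]`. Zoom only changes the intrinsics, so it is assumed that each pose was estimated with the camera matrix valid for that zoom level. All function names are hypothetical.

```python
import numpy as np

def rot_pan_tilt(pan, tilt):
    """Rotation taking camera-frame vectors to the fixed base frame.
    Assumed convention (OpenCV axes: x right, y down, z forward):
    positive pan turns right about y, positive tilt turns up about x."""
    cp, sp = np.cos(pan), np.sin(pan)
    ct, st = np.cos(tilt), np.sin(tilt)
    ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, ct, -st], [0.0, st, ct]])
    return ry @ rx

def to_homogeneous(R, t):
    """Pack a rotation matrix and translation vector into a 4x4 transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.ravel(t)
    return T

def marker_in_base(R_cm, t_cm, pan, tilt):
    """Marker pose in the camera's fixed base frame."""
    head = to_homogeneous(rot_pan_tilt(pan, tilt), np.zeros(3))
    return head @ to_homogeneous(R_cm, t_cm)

def base2_from_base1(pose1, pose2):
    """Fixed transform from camera 1's base frame to camera 2's base frame.
    Each pose is (R_cm, t_cm, pan, tilt), both taken while the cameras
    see the marker at the same position."""
    return marker_in_base(*pose2) @ np.linalg.inv(marker_in_base(*pose1))

def pan_tilt_to_face(point_in_base):
    """Pan and tilt that put the optical axis through the given point."""
    x, y, z = point_in_base
    return np.arctan2(x, z), np.arctan2(-y, np.hypot(x, z))

def aim_second_camera(T_b2_b1, t_cm_new, pan1_new, tilt1_new):
    """New pan/tilt for camera 2 from camera 1's new marker observation."""
    p_b1 = rot_pan_tilt(pan1_new, tilt1_new) @ np.ravel(t_cm_new)
    p_b2 = (T_b2_b1 @ np.append(p_b1, 1.0))[:3]
    return pan_tilt_to_face(p_b2)
```

`base2_from_base1` is computed once from the initial observation; after that, only `aim_second_camera` is needed each time the marker moves. On a real PTZ head the rotation axes are usually offset from the optical centre, so expect a small systematic error unless that offset is modelled as well.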
Related
How can I calculate the distance from the camera to an object of known size (e.g. an ArUco marker of 0.14 m printed on paper)? I know the camera matrix (camMatx), and my fx, fy ~= 600 px assuming no distortion. From this data I am able to calculate the pose of the ArUco marker and have obtained [R|t]. Now the task is to get the distance of the ArUco marker from the camera. I also know the height of the camera above the ground plane (15 m).
How should I go about solving this problem? Any help would be appreciated. Please also note that I have seen the similar-triangles approach, but that relies on knowing the distance of the object, which doesn't apply in my case because the distance is what I have to calculate.
N.B.: I don't know the camera sensor height, but I know how high the camera is located above the ground.
I know the dimensions of the area in which my object is moving (70 m x 45 m). In the end I would like to plot the coordinates of the moving object on a 2D map drawn to scale.
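Since [R|t] is already available, the straight-line distance is simply the length of the translation vector, expressed in the same unit as the marker size passed to the pose estimation. A minimal sketch follows; the function names are hypothetical, and the ground-distance helper additionally assumes the marker lies on a flat ground plane directly below the camera's height reference.

```python
import numpy as np

def marker_distance(tvec):
    """Straight-line distance from the camera centre to the marker origin."""
    return float(np.linalg.norm(tvec))

def ground_distance(tvec, camera_height):
    """Horizontal distance along the ground, assuming the marker sits on a
    flat ground plane and the camera is camera_height above that plane."""
    d = marker_distance(tvec)
    if d < camera_height:
        raise ValueError("marker is closer than the camera height allows")
    return float(np.sqrt(d ** 2 - camera_height ** 2))

# Example with a made-up translation vector in metres
tvec = np.array([0.0, 15.0, 20.0])
slant = marker_distance(tvec)          # 25.0
along_ground = ground_distance(tvec, 15.0)  # 20.0
```

The horizontal distance, together with the bearing of the marker in the image, is what would be plotted on the 2D map.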
I have 2D image data with the respective camera location in latitude and longitude. I want to translate pixel coordinates to 3D world coordinates. I have access to the intrinsic calibration parameters and to yaw, pitch and roll. Using yaw, pitch and roll I can derive the rotation matrix, but I don't understand how to calculate the translation vector. As I am working on a data set, I don't have physical access to the camera. Please help me derive the translation.
Cannot be done at all if you don't have the elevation of the camera with respect to the ground (AGL or ASL) or another way to resolve the scale from the image (e.g. by identifying in the image an object of known size, for example a soccer stadium in an aerial image).
Assuming you can resolve the scale, the next question is how precisely you can (or want to) model the terrain. For a first approximation you can use a standard geodetic ellipsoid (e.g. WGS-84). For higher precision, especially for images shot from lower altitudes, you will need to use a DTM and register it to the images. Either way, it is a standard back-projection problem: you compute the ray from the camera centre to the pixel, transform it into world coordinates, then intersect it with the ellipsoid or DTM.
There are plenty of open source libraries to help you do that in various languages (e.g. GeographicLib).
Edited to add suggestions:
Express your camera location in ECEF.
Transform the ray from the camera into ECEF as well, taking into account the camera rotation. You can do both transformations using a library, e.g. nVector.
Then proceed to intersect the ray with the ellipsoid, as explained in this answer.
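The steps above can be sketched roughly as follows. This is a minimal illustration with hypothetical function names, using the WGS-84 constants and plain NumPy rather than a geodesy library; the ray direction is assumed to be already rotated into ECEF.

```python
import numpy as np

WGS84_A = 6378137.0                  # semi-major axis in metres
WGS84_F = 1.0 / 298.257223563        # flattening
WGS84_B = WGS84_A * (1.0 - WGS84_F)  # semi-minor axis in metres

def geodetic_to_ecef(lat_deg, lon_deg, height_m):
    """Convert latitude, longitude and ellipsoidal height to ECEF metres."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    e2 = WGS84_F * (2.0 - WGS84_F)
    n = WGS84_A / np.sqrt(1.0 - e2 * np.sin(lat) ** 2)
    return np.array([
        (n + height_m) * np.cos(lat) * np.cos(lon),
        (n + height_m) * np.cos(lat) * np.sin(lon),
        (n * (1.0 - e2) + height_m) * np.sin(lat),
    ])

def intersect_ray_with_ellipsoid(origin, direction):
    """First intersection of origin + t * direction (t >= 0) with the
    WGS-84 ellipsoid, or None if the ray misses it."""
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    # Scale the axes so the ellipsoid becomes a unit sphere.
    scale = np.array([1.0 / WGS84_A, 1.0 / WGS84_A, 1.0 / WGS84_B])
    o, d = origin * scale, direction * scale
    a, b, c = d @ d, 2.0 * (o @ d), o @ o - 1.0
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    t = (-b - np.sqrt(disc)) / (2.0 * a)
    if t < 0.0:
        return None
    return origin + t * direction

# Camera 1000 m above the equator at longitude 0, looking straight down
camera = geodetic_to_ecef(0.0, 0.0, 1000.0)
hit = intersect_ray_with_ellipsoid(camera, [-1.0, 0.0, 0.0])
```

The resulting ECEF point can be converted back to latitude and longitude with the same library used for the forward conversion. Intersecting with a DTM instead of the ellipsoid needs an iterative search along the ray and is not covered by this sketch.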
I have stereo images (non co-planar cameras), which have matching points on a plane (wall) labeled.
I need to compute the camera locations in world space, and the angles they are focused at.
I can work out the math if I need to (with effort), but I wonder whether there is a shortcut for these computations in OpenCV that I might not be familiar with.
If they are both looking at a plane, all you have to do is estimate homographies (with findHomography) independently between the plane and each camera, then decompose them (decomposeHomographyMat) to get rotation and translation up to scale. To resolve scale you need to know the distance between at least two of the points on the plane.
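When the labeled points have known metric coordinates (X, Y) on the wall, the homography from the plane to the image has the form H ~ K [r1 r2 t], so rotation and translation can be read off directly and the scale is resolved by the metric coordinates. The sketch below shows that direct decomposition in plain NumPy; it is an alternative to decomposeHomographyMat for this special case, not a description of what that function does internally. It assumes H was estimated elsewhere (for example with findHomography) and that K is the calibrated camera matrix. The function names are hypothetical.

```python
import numpy as np

def pose_from_plane_homography(H, K):
    """Recover R, t such that pixel ~ K (R [X, Y, 0]^T + t), given a
    homography H mapping metric plane coordinates (X, Y, 1) to pixels."""
    A = np.linalg.inv(K) @ H
    # H is only defined up to scale: normalise so r1 and r2 have unit length.
    A = A * (2.0 / (np.linalg.norm(A[:, 0]) + np.linalg.norm(A[:, 1])))
    if A[2, 2] < 0.0:
        A = -A  # keep the plane origin in front of the camera
    r1, r2, t = A[:, 0], A[:, 1], A[:, 2]
    R = np.column_stack([r1, r2, np.cross(r1, r2)])
    # Project onto the nearest true rotation to absorb estimation noise.
    U, _, Vt = np.linalg.svd(R)
    return U @ Vt, t

def camera_center_in_plane_frame(R, t):
    """Camera position expressed in the wall's coordinate frame."""
    return -R.T @ t

# Synthetic check with a made-up camera
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
angle = 0.3
R_true = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(angle), 0.0, np.cos(angle)]])
t_true = np.array([0.1, -0.2, 3.0])
H = 2.5 * K @ np.column_stack([R_true[:, 0], R_true[:, 1], t_true])
R_est, t_est = pose_from_plane_homography(H, K)
```

Running this for each camera gives both camera positions and viewing directions in the same wall-based frame; the viewing direction of a camera is the third row of its R.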
We are currently using OpenCV to track a planar rectangular target. While the camera is facing straight ahead (no pitch), this works perfectly using findContours with solvePnP and returns a very accurate location of the target.
The problem is that we get different results once we increase the pitch. We know the pitch of the camera at all times.
How would I "cancel out" the pitch of the camera, and obtain coordinates as if the camera was facing straight ahead?
In the general case you can use an affine transform to map the quadrilateral seen by the camera back to the original rectangle. In your case the quadrilateral seen by the camera may be a good approximation of a parallelogram since only one angle is changing, but in real-world applications you can generally assume that the camera can have non-zero values for each of the three rotations (e.g. in pitch, yaw, and roll).
http://opencv.itseez.com/doc/tutorials/imgproc/imgtrans/warp_affine/warp_affine.html
The transform allows you to calculate the matching coordinates (x,y) within the rectangle's plane given coordinates (x', y') in the image of the rectangle.
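As a rough illustration of mapping image coordinates back into the rectangle's plane: an affine transform is exact only when the imaged shape is a parallelogram, so the sketch below uses the more general perspective transform (homography) computed from the four corners, which also covers the parallelogram case. It solves the standard 8x8 linear system in NumPy; OpenCV's getPerspectiveTransform computes the same kind of matrix. The function names and the corner values are made up for the example.

```python
import numpy as np

def homography_from_4_points(src, dst):
    """Perspective transform mapping four src points onto four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, point):
    """Map one 2D point through H, including the perspective division."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return np.array([x / w, y / w])

# Quadrilateral seen in the image (pixels) and the true rectangle (metres)
image_quad = [(100.0, 100.0), (300.0, 100.0), (350.0, 300.0), (50.0, 300.0)]
rectangle = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
H = homography_from_4_points(image_quad, rectangle)
corner = apply_homography(H, (350.0, 300.0))
```

Any other image point inside the quadrilateral can be mapped the same way with `apply_homography`.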
I have a problem that involves a UAV flying with a camera mounted below it. The following information is provided:
GPS Location of the UAV in Lat/Long
GPS Height of the UAV in meters
Attitude of the UAV i.e. roll, pitch, and yaw in degrees
Field of View (FOV) of the camera in degrees
Elevation of the camera w.r.t UAV in degrees
Azimuth of camera w.r.t UAV in degrees
I have some images taken from that camera during a flight, and my task is to compute the locations (in Lat/Long) of the 4 corner points and the center point of each image so that the image can be placed on the map at the proper location.
I found a document while searching the internet that can be downloaded at the following link:
http://www.siaa.asn.au/get/2411853249.pdf
My maths background is very weak, so I am not able to translate the document into a working solution.
Can somebody provide me a solution to my problem in the form of a simple algorithm, or preferably in the form of code in some programming language?
Thanks.
As I see it, this is not related to image processing, because you need to determine the coordinates of the center of the image (you do not even need the FOV for that). You have to find the intersection of the camera's principal ray with the earth's surface (if I've understood your task correctly). This is nothing more than basic matrix math.
See wiki:Transformation.
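A minimal sketch of that matrix math for the image center is given below. It rests on several assumptions that are not in the question: a flat ground plane, a height that is measured above that ground (GPS height usually is not), a local north-east-down frame, yaw-pitch-roll applied in Z-Y-X order, and a camera boresight that points along the body x axis when azimuth and elevation are zero, with negative elevation looking down. The function names are hypothetical.

```python
import numpy as np

EARTH_RADIUS_M = 6371000.0  # mean radius, adequate for short offsets

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def principal_ray_ned(yaw, pitch, roll, cam_az, cam_el):
    """Unit vector of the camera's principal ray in north-east-down axes.
    All angles are in radians."""
    body_to_ned = rot_z(yaw) @ rot_y(pitch) @ rot_x(roll)
    cam_to_body = rot_z(cam_az) @ rot_y(cam_el)
    return body_to_ned @ cam_to_body @ np.array([1.0, 0.0, 0.0])

def image_center_lat_lon(lat_deg, lon_deg, height_m,
                         yaw_deg, pitch_deg, roll_deg, az_deg, el_deg):
    """Latitude and longitude where the principal ray meets flat ground."""
    d = principal_ray_ned(*np.radians([yaw_deg, pitch_deg, roll_deg,
                                       az_deg, el_deg]))
    if d[2] <= 0.0:
        raise ValueError("camera points at or above the horizon")
    scale = height_m / d[2]          # ray length factor to reach the ground
    north, east = scale * d[0], scale * d[1]
    lat = lat_deg + np.degrees(north / EARTH_RADIUS_M)
    lon = lon_deg + np.degrees(
        east / (EARTH_RADIUS_M * np.cos(np.radians(lat_deg))))
    return lat, lon
```

The four corners follow from the same intersection by replacing the boresight vector with rays tilted by half the field of view in each direction, which is where the FOV enters.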