Does anyone have information on how to map the texture image of a Google Photo Sphere onto a sphere yourself (not using the Google API)? My goal is to do it myself in MATLAB, but I was unable to find any information about the mapping coordinates.
Thanks in advance,
Thomas
You can find details about the metadata of a Photo Sphere here:
https://developers.google.com/photo-sphere/metadata/
Essentially, the image uses an equirectangular projection [1], which you only need to map onto the inside of a sphere, with the camera placed at the center. If the Photo Sphere is a full 360/180 degree panorama, you can map it onto the whole inside of the sphere. If it is only a partial panorama, you can use the metadata inside the photo to determine the exact position where the texture needs to be placed.
[1] https://en.wikipedia.org/wiki/Equirectangular_projection
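For reference, here is a minimal sketch of the pixel-to-sphere mapping (written in Python/NumPy, but the formulas translate directly to MATLAB); the function name and the assumption of a full 360/180 degree panorama are mine:
import numpy as np

def equirect_to_sphere(width, height, radius=1.0):
    # Pixel centres, normalised to [0, 1] across and down the image.
    u = (np.arange(width) + 0.5) / width
    v = (np.arange(height) + 0.5) / height
    lon = (u - 0.5) * 2.0 * np.pi        # longitude: -pi .. pi
    lat = (0.5 - v) * np.pi              # latitude: pi/2 .. -pi/2 (top row = pole)
    lon, lat = np.meshgrid(lon, lat)

    # Spherical -> Cartesian; the texture is then viewed from inside the sphere.
    x = radius * np.cos(lat) * np.cos(lon)
    y = radius * np.cos(lat) * np.sin(lon)
    z = radius * np.sin(lat)
    return x, y, z
For a partial panorama you would restrict the lon/lat ranges to the values given in the XMP metadata instead of the full -pi..pi and pi/2..-pi/2.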
Did you try using the warp function?
a = imread('PANO.jpg');   % equirectangular panorama image
[x,y,z] = sphere(200);    % unit sphere sampled on a 200-by-200 grid
warp(x,y,z,a);            % drape the image over the sphere surface as a texture
Using the camera tools you can set the camera position inside the sphere, but I think an outside view can be enough with an adequate level of zoom.
Related
I have 2D image data with the respective camera location in latitude and longitude. I want to translate pixel coordinates to 3D world coordinates. I have access to the intrinsic calibration parameters and to yaw, pitch and roll. From yaw, pitch and roll I can derive the rotation matrix, but I don't see how to calculate the translation matrix. As I am working on a dataset, I don't have physical access to the camera. Please help me derive the translation matrix.
This cannot be done at all if you don't have the elevation of the camera with respect to the ground (AGL or ASL), or another way to resolve the scale from the image (e.g. by identifying an object of known size in the image, for example a soccer stadium in an aerial photo).
Assuming you can resolve the scale, the next question is how precisely you can (or want to) model the terrain. For a first approximation you can use a standard geodetic ellipsoid (e.g. WGS-84). For higher precision, especially for images shot from lower altitudes, you will need to use a DTM and register it to the images. Either way, it is a standard back-projection problem: you compute the ray from the camera centre to the pixel, transform it into world coordinates, then intersect it with the ellipsoid or DTM.
There are plenty of open source libraries to help you do that in various languages (e.g. GeographicLib).
Edited to add suggestions:
Express your camera location in ECEF.
Transform the ray from the camera into ECEF as well, taking the camera rotation into account. You can do both transformations using a library, e.g. nVector.
Then proceed to intersect the ray with the ellipsoid, as explained in this answer.
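As a rough illustration of that last step, here is a minimal Python/NumPy sketch of the ray/ellipsoid intersection, assuming a bare WGS-84 ellipsoid (no DTM); the function name and variable names are mine:
import numpy as np

A = 6378137.0          # WGS-84 equatorial radius in metres
B = 6356752.314245     # WGS-84 polar radius in metres

def ray_ellipsoid_intersection(origin_ecef, direction_ecef):
    # origin_ecef: camera centre in ECEF (metres)
    # direction_ecef: ray direction in ECEF (camera rotation already applied)
    o = np.asarray(origin_ecef, dtype=float)
    d = np.asarray(direction_ecef, dtype=float)

    # Scale the axes so the ellipsoid becomes a unit sphere, then solve the
    # quadratic |o_s + t * d_s|^2 = 1 for the ray parameter t.
    scale = np.array([1.0 / A, 1.0 / A, 1.0 / B])
    os_, ds_ = o * scale, d * scale
    a = ds_ @ ds_
    b = 2.0 * os_ @ ds_
    c = os_ @ os_ - 1.0
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None                        # ray misses the ellipsoid
    t = (-b - np.sqrt(disc)) / (2.0 * a)   # nearest of the two intersections
    if t < 0.0:
        return None                        # intersection lies behind the camera
    return o + t * d                       # ground point in ECEF (metres)
The resulting ECEF point can then be converted back to latitude/longitude with GeographicLib or a similar library.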
I've written a little app using CoreMotion, AV and SceneKit to make a simple panorama. When you take a picture, it maps that onto a SK rectangle and places it in front of whatever CM direction the camera is facing. This is working fine, but...
I would like the user to be able to click a "done" button and turn the entire scene into a single image. I could then map that onto a sphere for future viewing rather than re-creating the entire set of objects. I don't need to stitch or anything like that, I want the individual images to remain separate rectangles, like photos glued to the inside of a ball.
I know about snapshot and tried using that with a really wide FOV, but that results in a fisheye view that does not map back properly (unless I'm doing it wrong). I assume there is some sort of transform I need to apply? Or perhaps there is an easier way to do this?
The key is "photos glued to the inside of a ball". You have a bunch of rectangles, suspended in space. Turning that into one image suitable for projection onto a sphere is a bit of work. You'll have to project each rectangle onto the sphere, and warp the image accordingly.
If you just want to reconstruct the scene for future viewing in SceneKit, use SCNScene's built in serialization, write(to:options:delegate:progressHandler:) and SCNScene(named:).
To compute the mapping of images onto a sphere, you'll need some coordinate conversion. For each image, convert the coordinates of the corners into spherical coordinates, with the origin at your point of view. Change the radius of each corner's coordinate to the radius of your sphere, and you now have the projected corners' locations on the sphere.
It's tempting to repeat this process for each pixel in the input rectangular image. But that will leave empty pixels in the spherical output image. So you'll work in reverse. For each pixel in the spherical output image (within the 4 corner points), compute the ray (trivially done, in spherical coordinates) from POV to that point. Convert that ray back to Cartesian coordinates, compute its intersection with the rectangular image's plane, and sample at that point in your input image. You'll want to do some pixel weighting, since your output image and input image will have different pixel dimensions.
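To make that reverse-mapping step concrete, here is a minimal sketch of the per-pixel calculation in Python/NumPy (the math itself is platform independent; the function and parameter names are mine, and the point of view is assumed to sit at the origin):
import numpy as np

def sample_source_rect(lon, lat, rect_origin, rect_u, rect_v):
    # rect_origin: 3D position of the rectangle's corner (NumPy array)
    # rect_u, rect_v: 3D edge vectors spanning the rectangle's width and height
    # Ray from the point of view through the spherical direction (lon, lat).
    ray = np.array([np.cos(lat) * np.cos(lon),
                    np.cos(lat) * np.sin(lon),
                    np.sin(lat)])

    # Solve  k*ray = rect_origin + s*rect_u + t*rect_v  for (k, s, t):
    # k is the distance along the ray, (s, t) are texture coordinates in [0, 1].
    M = np.column_stack((ray, -rect_u, -rect_v))
    try:
        k, s, t = np.linalg.solve(M, rect_origin)
    except np.linalg.LinAlgError:
        return None                    # ray is parallel to the rectangle's plane
    if k <= 0 or not (0.0 <= s <= 1.0 and 0.0 <= t <= 1.0):
        return None                    # behind the viewer or outside the rectangle
    return s, t                        # sample the source image at (s, t)
You would run this for every output pixel inside the projected corner quad, then do the pixel weighting mentioned above when reading from the source image.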
OpenCV docs for solvePnP
In an augmented reality app, I detect the image in the scene so I know the imagePoints, but the object I'm looking for (objectPoints) is a virtual marker just stored in memory to search for in the scene, so I don't know where it is in space. The book I'm reading (Mastering OpenCV with Practical Computer Vision Projects) passes it as if the marker is a 1x1 matrix, and it works fine. How? Doesn't solvePnP need to know the size of the object and its projection so we know how much scale is applied?
Assuming you're looking for a physical object, you should pass the 3D coordinates of the points on the model which are mapped (by projection) to the 2D points in the image. You can use any reference frame, and the result of solvePnP will give you the position and orientation of the camera in that reference frame.
If you want to get the object's position/orientation in camera space, you can then transform both by the inverse of the transform you got from solvePnP, so that the camera is moved to the origin.
For example, for a cube object of size 2x2x2, the visible corners may be something like: {-1,-1,-1},{1,-1,-1},{1,1,-1}.....
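For illustration, here is a minimal Python/OpenCV sketch along those lines; the pixel coordinates and intrinsics are made-up placeholder values:
import numpy as np
import cv2

# 3D corners of the 2x2x2 cube's visible face, in the object's own frame.
# These are the objectPoints: solvePnP recovers the pose that projects them
# onto the detected imagePoints.
object_points = np.array([[-1, -1, -1],
                          [ 1, -1, -1],
                          [ 1,  1, -1],
                          [-1,  1, -1]], dtype=np.float64)

# 2D pixel locations of the same corners detected in the image (placeholders).
image_points = np.array([[320, 240],
                         [420, 245],
                         [425, 340],
                         [315, 335]], dtype=np.float64)

camera_matrix = np.array([[800, 0, 320],
                          [0, 800, 240],
                          [0,   0,   1]], dtype=np.float64)
dist_coeffs = np.zeros(5)      # assume an undistorted image

ok, rvec, tvec = cv2.solvePnP(object_points, image_points,
                              camera_matrix, dist_coeffs)

# rvec/tvec map points from the object frame into the camera frame;
# invert them to get the camera pose expressed in the object frame.
R, _ = cv2.Rodrigues(rvec)
camera_position = -R.T @ tvec
Note that the scale of the result is entirely set by the units of object_points: if you pass a 1x1 marker, the translation comes out in "marker widths".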
You have to pass the 3D coordinates of the real-world object that you want to map with the image. The scaling and rotation values will depend on the coordinate system that you use.
This is not as difficult as it sounds. See this blog post on head pose estimation for more details and code.
I'm trying to get the 3D coordinate of a point from the triangulation of two view.
I'm not really sure if my results are correct or not.
For example, I'm not really sure whether the signs of the coordinates are correct, because I'm not sure how the camera frame is oriented.
Is the positive direction of the z axis entering or exiting the image plane?
And x and y? Do they follow the right hand rule?
Thanks in advance for a clarification!
The coordinate system is set according to the image and the description on this webpage
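In case it helps, here is a minimal Python sketch, assuming you triangulate with OpenCV's cv2.triangulatePoints and express the first camera as K[I|0]; the intrinsics, relative pose and pixel coordinates below are placeholders:
import numpy as np
import cv2

K = np.array([[800, 0, 320],
              [0, 800, 240],
              [0,   0,   1]], dtype=np.float64)

# Camera 1 at the origin; camera 2 given by a relative rotation R and translation t.
R = np.eye(3)
t = np.array([[0.1], [0.0], [0.0]])
P1 = K @ np.hstack((np.eye(3), np.zeros((3, 1))))
P2 = K @ np.hstack((R, t))

# Matching pixel coordinates in the two views, shape 2xN (placeholder values).
pts1 = np.array([[330.0], [250.0]])
pts2 = np.array([[310.0], [250.0]])

X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)   # 4xN homogeneous points
X = X_h[:3] / X_h[3]                              # Euclidean 3D points

# With this setup the points are expressed in camera 1's frame, which in the
# usual OpenCV pinhole convention is right-handed: x points right, y points
# down, z points forward away from the camera into the scene, so a point in
# front of the camera has a positive z coordinate.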
I have a problem that involves a UAV flying with a camera mounted below it. Following information is provided:
GPS Location of the UAV in Lat/Long
GPS Height of the UAV in meters
Attitude of the UAV i.e. roll, pitch, and yaw in degrees
Field of View (FOV) of the camera in degrees
Elevation of the camera w.r.t UAV in degrees
Azimuth of camera w.r.t UAV in degrees
I have some images taken from that camera during a flight, and my task is to compute the locations (in Lat/Long) of the 4 corner points and the center point of the image so that the image can be placed on the map at the proper location.
I found a document while searching the internet that can be downloaded at the following link:
http://www.siaa.asn.au/get/2411853249.pdf
My maths background is very weak so I am not able to translate the document into a working solution.
Can somebody provide me a solution to my problem in the form of a simple algorithm or preferable in the form of code of some programming language?
Thanks.
As far as I can see, this is not really an image-processing problem: to determine the coordinates of the center of the image you do not even need the FOV. You have to find the intersection of the camera's principal ray with the Earth's surface (if I have understood your task correctly). This is nothing more than basic matrix math.
See wiki:Transformation.
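As a rough illustration for the image centre only (the corner points additionally need the FOV to build four corner rays instead of the boresight), here is a Python/NumPy sketch under a flat-earth, local NED assumption; the rotation order, gimbal convention and function names are all assumptions of mine:
import numpy as np

def rot_x(a):  # roll
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):  # pitch
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):  # yaw (about the down axis in NED)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def ground_offset(height, roll, pitch, yaw, cam_elev, cam_azim):
    # height: metres above (assumed flat) ground; all angles in radians.
    # Camera boresight in the UAV body frame: start pointing straight down,
    # then apply the assumed gimbal elevation and azimuth.
    boresight_body = rot_z(cam_azim) @ rot_y(cam_elev) @ np.array([0.0, 0.0, 1.0])

    # Body-to-NED rotation from the UAV attitude (yaw-pitch-roll order assumed).
    R_body_to_ned = rot_z(yaw) @ rot_y(pitch) @ rot_x(roll)
    ray_ned = R_body_to_ned @ boresight_body

    if ray_ned[2] <= 0:
        return None                      # ray does not point towards the ground
    t = height / ray_ned[2]              # scale the ray until it reaches the ground
    north, east = t * ray_ned[0], t * ray_ned[1]
    return north, east                   # metre offsets from the UAV's position

# The metre offsets convert to lat/lon with the usual small-offset approximation:
# dlat = north / 6378137.0, dlon = east / (6378137.0 * cos(lat)), in radians.
For higher accuracy, or for steep look angles over varying terrain, you would replace the flat-ground step with an intersection against an ellipsoid or a terrain model, as in the back-projection answer further up this page.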