Recreate the 3D outlines of a city street in iOS SceneKit with OSM XML data

What is the best strategy to recreate part of a street in iOS SceneKit using .osm XML data?
Please assume that part of a street is available in the OSM XML data and that it contains the necessary geopoints (nodes with latitude and longitude) describing the paths/footprints of 6 buildings (i.e. ground floor plans that line one side of a street).
Specifically, what's the best strategy to convert those latitude/longitude nodes so the building footprints/polygons can be placed on the ground plane of a SceneKit scene (i.e. a plane running through position 0,0,0)? Thank you.

Very roughly and briefly, based on my own experience with 3D map rendering:
1. Transform the XML data from lat/long to appropriate coordinates for a 2D map (that is, project it to a plane using a map projection, then apply a 2D affine transform to get it into screen pixel coordinates). Create a 2D map that's wider and taller than the actual screen, because of what's going to happen in step 2.
2. Using a 3D coordinate system with your map vertical (i.e., set all the Z coordinates to zero), rotate the map so that it reclines at an appropriately shallow angle, as if you're in an aeroplane looking down on it; the angle might be 30 degrees from horizontal. To rotate the map you'll need to create a 3D rotation matrix. The axis of rotation will be the X axis: that is, the horizontal line that is the bottom border of your 2D map. The rotation is exactly the same as what happens when you rotate your laptop screen away from you.
3. Supply the new 3D coordinates to your rendering system. I haven't used SceneKit but I had a quick look at the documentation and you can use any coordinate system you like, so you will be able to use one that is convenient for the process I have just described: something that uses units the size of a screen pixel at the viewing plane, with Y going upwards, X going right, and Z going away from the viewer.
One final caveat: if you want to add extrusions giving a rough approximation of the 3D building shapes (such data is available in OSM for some areas) note that my scheme requires the tops of buildings, and indeed anything above ground level, to have negative Z coordinates.
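For the lat/long conversion in step 1, when the area is as small as a single street a common shortcut is a local tangent-plane approximation rather than a full map projection. A minimal sketch in Swift (my own illustration, not part of the answer above; it assumes one of the OSM nodes is chosen as the scene origin):

import CoreLocation
import SceneKit

// Convert a lat/long node to meters relative to a reference node, so the
// building footprints end up centered near the SceneKit origin (0, 0, 0).
// Assumes a small area, where a flat-earth approximation is acceptable.
func localPosition(of coordinate: CLLocationCoordinate2D,
                   relativeTo origin: CLLocationCoordinate2D) -> SCNVector3 {
    let metersPerDegreeLatitude = 111_320.0
    let metersPerDegreeLongitude = 111_320.0 * cos(origin.latitude * .pi / 180)
    let x = (coordinate.longitude - origin.longitude) * metersPerDegreeLongitude
    let z = -(coordinate.latitude - origin.latitude) * metersPerDegreeLatitude   // north maps to -Z
    return SCNVector3(Float(x), 0, Float(z))   // Y is up; footprints sit on the ground plane
}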

Pretty simple. First, convert your CLLocationCoordinate2D to an MKMapPoint, which is essentially just an x/y pair like a CGPoint. Second, scale the MKMapPoint down by some arbitrary factor so it fits how you want it in your scene graph, let's say by 200. Since SceneKit's coordinate system is centered at (0, 0, 0), you'll also need to offset your locations so they end up where you want them relative to the origin. Then just create your SCNVector3s from the x/y of the MKMapPoint, and your nodes will line up with the map coordinates.
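A minimal sketch of that approach (my own illustration; the reference coordinate and the scale factor of 200 are arbitrary assumptions):

import MapKit
import SceneKit

// Hypothetical reference coordinate that should land at the SceneKit origin.
let originCoordinate = CLLocationCoordinate2D(latitude: 51.5007, longitude: -0.1246)
// MKMapPoint(coordinate) on recent SDKs; MKMapPointForCoordinate(_:) on older ones.
let originPoint = MKMapPoint(originCoordinate)
let scale = 200.0   // arbitrary shrink factor

func scenePosition(for coordinate: CLLocationCoordinate2D) -> SCNVector3 {
    let point = MKMapPoint(coordinate)
    let x = (point.x - originPoint.x) / scale
    let z = (point.y - originPoint.y) / scale
    return SCNVector3(Float(x), 0, Float(z))   // footprint vertices sit on the ground plane, y = 0
}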

Related

How do you get the UIKit coordinates of an ARKit feature point?

I have an ARSCNView and I am tracking feature points in the scene. How would I get the 2D coordinates of a feature point (i.e. the coordinates of that point on the screen) from its 3D world coordinates?
(Essentially the opposite of sceneView.hitTest)
Converting a point from 3D space (usually camera or world space) to 2D view (pixel) space is called projecting that point. (Because it involves a projection transform that defines how to flatten the third dimension.)
ARKit and SceneKit both offer methods for projecting points (and unprojecting points, the reverse transform that requires extra input on how to extrapolate the third dimension).
Since you're working with ARSCNView, you can just use the projectPoint method. (That's inherited from the superclass SCNView and defined in the SCNSceneRenderer protocol, but still applies in AR because ARKit world space is the same as SceneKit world/scene/rootNode space.) Note you'll need to convert back and forth between float3 and SCNVector3 for that method.
Also note the returned "2D" point is still a 3D vector — the x and y coordinates are screen pixels (well, "points" as in UIKit layout units), and the third is a relative depth value. Just make a CGPoint from the first two coordinates for something you can use with other UIKit API.
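For example, a minimal sketch (assuming an ARSCNView called sceneView and a feature point position taken from ARFrame.rawFeaturePoints):

import ARKit
import SceneKit

// featurePoint: a world-space position, e.g. one entry from
// sceneView.session.currentFrame?.rawFeaturePoints?.points
func screenPosition(of featurePoint: simd_float3, in sceneView: ARSCNView) -> CGPoint {
    let worldPosition = SCNVector3(featurePoint.x, featurePoint.y, featurePoint.z)   // float3 -> SCNVector3
    let projected = sceneView.projectPoint(worldPosition)
    // x and y are view coordinates in points; z is a normalized depth value
    return CGPoint(x: CGFloat(projected.x), y: CGFloat(projected.y))
}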
BTW, if you're using ARKit without SceneKit, there's also a projectPoint method on ARCamera.

Finding the depth in ARKit with SCNVector3Make

The goal of the project is to create a drawing app. I want it so that when I touch the screen and move my finger, the paint follows the finger and leaves a cyan trail. I did create it, BUT there is one problem: the paint DEPTH is always randomly placed.
Here is the code; you just need to connect the sceneView with the storyboard.
https://github.com/javaplanet17/test/blob/master/drawingar
My question is: how do I make the program so that the depth is always consistent? By consistent I mean there is always the same distance between the paint and the camera.
If you run the code above you will see that I have printed out all the SCNMatrix4 values, but none of them is the DEPTH.
I have tried to change hitTransform.m43 but it only messes up the x and y.
If you want to get a point some consistent distance in front of the camera, you don’t want a hit test. A hit test finds the real world surface in front of the camera — unless your camera is pointed at a wall that’s perfectly parallel to the device screen, you’re always going to get a range of different distances.
If you want a point some distance in front of the camera, you need to get the camera’s position/orientation and apply a translation (your preferred distance) to that. Then to place SceneKit content there, use the resulting matrix to set the transform of a SceneKit node.
The easiest way to do this is to stick to SIMD vector/matrix types throughout rather than converting between those and SCN types. SceneKit adds a bunch of new accessors in iOS 11 so you can use SIMD types directly.
There’s at least a couple of ways to go about this, depending on what result you want.
Option 1
// set up z translation for 20 cm in front of whatever
// last column of a 4x4 transform matrix is translation vector
var translation = matrix_identity_float4x4
translation.columns.3.z = -0.2
// get camera transform the ARKit way
// (currentFrame is optional, so handle the nil case properly in real code)
let cameraTransform = view.session.currentFrame!.camera.transform
// if we wanted, we could go the SceneKit way instead; result is the same
// let cameraTransform = view.pointOfView!.simdTransform
// set node transform by multiplying matrices
node.simdTransform = cameraTransform * translation
This option, using a whole transform matrix, not only puts the node a consistent distance in front of your camera, it also orients it to point the same direction as your camera.
Option 2
// distance vector for 20 cm in front of whatever
let translation = float3(x: 0, y: 0, z: -0.2)
// treat distance vector as in camera space, convert to world space
let worldTranslation = view.pointOfView!.simdConvertPosition(translation, to: nil)
// set node position (not whole transform)
node.simdPosition = worldTranslation
This option sets only the position of the node, leaving its orientation unchanged. For example, if you place a bunch of cubes this way while moving the camera, they’ll all be lined up facing the same direction, whereas with option 1 they’d all be in different directions.
Going beyond
Both of the options above are based only on the 3D transform of the camera — they don’t take the position of a 2D touch on the screen into account.
If you want to do that, too, you’ve got more work cut out for you — essentially what you’re doing is hit testing touches not against the world, but against a virtual plane that’s always parallel to the camera and a certain distance away. That plane is a cross section of the camera projection frustum, so its size depends on what fixed distance from the camera you place it at. A point on the screen projects to a point on that virtual plane, with its position on the plane scaling proportional to the distance from the camera.
So, to map touches onto that virtual plane, there are a couple of approaches to consider. (Not giving code for these because it’s not code I can write without testing, and I’m in an Xcode-free environment right now.)
Make an invisible SCNPlane that’s a child of the view’s pointOfView node, parallel to the local xy-plane and some fixed z distance in front. Use SceneKit hitTest (not ARKit hit test!) to map touches to that plane, and use the worldCoordinates of the hit test result to position the SceneKit nodes you drop into your scene.
Use Option 1 or Option 2 above to find a point some fixed distance in front of the camera (or a whole translation matrix oriented to match the camera, translated some distance in front). Use SceneKit’s projectPoint method to find the normalized depth value Z for that point, then call unprojectPoint with your 2D touch location and that same Z value to get the 3D position of the touch location with your camera distance. (For extra code/pointers, see my similar technique in this answer.)
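A rough sketch of that second approach (untested, my own illustration; it assumes an ARSCNView called sceneView and a touch location already in view coordinates):

import ARKit
import SceneKit

// Map a 2D touch to a world position a fixed distance in front of the camera.
func worldPosition(for touch: CGPoint, in sceneView: ARSCNView, distance: Float = 0.2) -> SCNVector3? {
    guard let camera = sceneView.pointOfView else { return nil }
    // a point `distance` meters straight ahead of the camera, in world space
    let ahead = camera.simdConvertPosition(SIMD3<Float>(0, 0, -distance), to: nil)
    // project it to find the normalized depth value for that distance...
    let projected = sceneView.projectPoint(SCNVector3(ahead.x, ahead.y, ahead.z))
    // ...then unproject the touch location at that same depth
    return sceneView.unprojectPoint(SCNVector3(Float(touch.x), Float(touch.y), projected.z))
}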

Turn an entire SceneKit scene into an image suitable for a texture

I've written a little app using CoreMotion, AV and SceneKit to make a simple panorama. When you take a picture, it maps that onto a SK rectangle and places it in front of whatever CM direction the camera is facing. This is working fine, but...
I would like the user to be able to click a "done" button and turn the entire scene into a single image. I could then map that onto a sphere for future viewing rather than re-creating the entire set of objects. I don't need to stitch or anything like that, I want the individual images to remain separate rectangles, like photos glued to the inside of a ball.
I know about snapshot and tried using that with a really wide FOV, but that results in a fisheye view that does not map back properly (unless I'm doing it wrong). I assume there is some sort of transform I need to apply? Or perhaps there is an easier way to do this?
The key is "photos glued to the inside of a ball". You have a bunch of rectangles, suspended in space. Turning that into one image suitable for projection onto a sphere is a bit of work. You'll have to project each rectangle onto the sphere, and warp the image accordingly.
If you just want to reconstruct the scene for future viewing in SceneKit, use SCNScene's built-in serialization: write(to:options:delegate:progressHandler:) and SCNScene(named:).
To compute the mapping of images onto a sphere, you'll need some coordinate conversion. For each image, convert the coordinates of the corners into spherical coordinates, with the origin at your point of view. Change the radius of each corner's coordinate to the radius of your sphere, and you now have the projected corners' locations on the sphere.
It's tempting to repeat this process for each pixel in the input rectangular image. But that will leave empty pixels in the spherical output image. So you'll work in reverse. For each pixel in the spherical output image (within the 4 corner points), compute the ray (trivially done, in spherical coordinates) from POV to that point. Convert that ray back to Cartesian coordinates, compute its intersection with the rectangular image's plane, and sample at that point in your input image. You'll want to do some pixel weighting, since your output image and input image will have different pixel dimensions.
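As a starting point for that corner conversion, here is a minimal sketch (my own illustration; it assumes an equirectangular output image and puts the point of view at the origin):

import Foundation
import CoreGraphics
import simd

// Map a world-space point (relative to the point of view) to pixel
// coordinates in an equirectangular image of size width x height.
func equirectangularPixel(for point: SIMD3<Float>, width: Int, height: Int) -> CGPoint {
    let r = Double(simd_length(point))
    let theta = atan2(Double(point.x), Double(-point.z))   // longitude, -π ... π, with -Z as "forward"
    let phi = asin(Double(point.y) / r)                    // latitude, -π/2 ... π/2
    let u = (theta + .pi) / (2 * .pi)                      // 0 ... 1 across the image
    let v = 1 - (phi + .pi / 2) / .pi                      // 0 ... 1 from top to bottom
    return CGPoint(x: u * Double(width), y: v * Double(height))
}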

Aruco scales coordinates wrong

I am using the (newly released) ArUco 2.0.7 to track some markers.
The camera that I am using is mounted to the ceiling facing down, so I only need the x and y coordinates.
It can view an area of 2.6 m by 1.5 m. If I understand the documentation correctly, when I supply the side length of the markers I'm using in an arbitrary unit, the output of the pose will be in that same unit.
So the markers have a side length of 19.5 cm. As I want my result in meters, I have that value set to 0.195.
However, the results I obtain are not correct. If I place the markers right in the corners of the field of view of the camera, they are not at the corresponding expected x and y coordinates.
I am placing the global origin at one of the corners of the field of view, e.g. (0,0) is the bottom left corner. This is done by transforming all incoming positions into that marker's coordinate system using the matrix transforms obtained by getRTMatrix().
Everything seems to be working, except that the x and y coordinates are in the wrong unit or scaled. The rotation works perfectly.
Am I missing something? Or can I not expect good accuracy? The error is significant, e.g. when it should be (2.6, 1.5), it is displayed as (1.8, 1), which is roughly a 33% error.
After some more thought I figured out that my camera had simply been calibrated using a smaller distance from the calibration board to the lens than what I need for my use case.
This caused the distortion coefficients to be wrong, thus giving me a bogus scale.
I re-calibrated using the aruco_calibration tool and am now accurate to roughly 3 or 4 cm, which is good enough for me.

Mapping lat/lon coordinates to a bitmap image of a map, not fixed to one projection

I'm currently developing a small piece of (Java) software that should be able to display maps and the current GPS position within that map.
I'm absolutely new to this, but it is pretty obvious that I'll have to do some kind of coordinate transformation.
I've found "Proj4J", which seems to be able to do a lot for me.
Now, what I have and what I want to do:
I have a bitmap of a map. The projection of this map can be any "well-defined" one, like Lambert or Mercator. I cannot fix this to one projection.
I have GPS coordinates from a "standard" GPS receiver. I believe they are lat/lon in WGS84, is that correct?
Now my questions:
I must map the GPS position to basically "screen coordinates" in my map bitmap. And for that, I assume, reference points are needed for which I know their lat/lon and corresponding pixel positions. Since my map can easily cover a couple of hundred kilometers in range, a linear interpolation between the known points and an arbitrary position is probably not correct for all types of projections, am I right on that?
I've read "Convert long/lat to pixel x/y on a given picture" so far, but this deals with a Mercator projection, and I believe a linear approximation will work better there than for a Lambert map.
I imagine the whole process is as follows:
1. "Calibrate" the map, i.e. identify two positions of known lat/lon in the bitmap and thus get their pixel positions.
2. Use the Proj.4 transformation from "lat/lon WGS84" to "map projection" to map those reference points from (1.) into map coordinates.
3. Take the points from (2.) and map them again to a projection that allows linear interpolation of the pixel positions; I'll call that the "pixel projection".
4. Now I have two reference points with coordinates in the "pixel projection" and their corresponding pixel positions.
For a lat/lon value from the GPS receiver, do the following:
1. Convert the position to a map position using the "map projection".
2. Take the map position from (1.) and convert it to a coordinate using the "pixel projection" from above.
3. Since all distances in the "pixel projection" are maintained (that is the condition of the pixel projection!), the resulting coordinate from (2.) can be interpolated against the known positions of the reference points from above (see the sketch below).
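To make that interpolation concrete, here is a rough sketch (written in Swift only to keep the code on this page in one language; the same arithmetic applies in Java with Proj4J, and the type and function names are just illustrative). It assumes both reference points have already been converted into the "pixel projection" and that the bitmap is not rotated relative to that projection's axes:

import CoreGraphics

// A calibration point: its coordinates in the "pixel projection" (proj)
// and the corresponding position in the bitmap (pixel).
struct ReferencePoint {
    let proj: CGPoint
    let pixel: CGPoint
}

// Linearly interpolate a projected coordinate into bitmap pixel coordinates,
// scaling x and y independently between the two reference points.
func pixelPosition(for proj: CGPoint, refA: ReferencePoint, refB: ReferencePoint) -> CGPoint {
    let tx = (proj.x - refA.proj.x) / (refB.proj.x - refA.proj.x)
    let ty = (proj.y - refA.proj.y) / (refB.proj.y - refA.proj.y)
    return CGPoint(x: refA.pixel.x + tx * (refB.pixel.x - refA.pixel.x),
                   y: refA.pixel.y + ty * (refB.pixel.y - refA.pixel.y))
}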
Here the big questions:
Is this the way to go, using a final "pixel projection" to allow linear interpolation?
What type of projection would that be and can that be done with Proj.4?
Can the "way back" (I have a pixel position and want lat/lon) be accomplished, i.e. "pixel position" -> "pixel projection" -> "map projection" -> "lat/lon"?
Thank you very much,
Jens.
