How to detect and visualize ceiling with RealityKit? - ios

Since RealityKit is new to me, apparently the plane anchors now have classification (floor, wall, ceiling, ...), so does anyone of you know the procedure of detecting ceilings (or any class of a plane)with RealityKit and ARKit and visualizing them using specific color (for instance red). Any information regarding this would be welcome.

ARKit/RealityKit can detect horizontal surfaces, so yes, it can definitely detect a ceiling. Of course, ARKit won't inherently know that it is a ceiling vs a floor or a table. You could compare the ARPlaneAnchor's y value to the phone's present y value and determine that it is likely a ceiling and add geometry/coloration based on that.

Related

Why we use fixed focus for AR tracking in ARCore?

I am using ARCore to track an image. Based on the following reference, the FOCUSMODE of the camera should be set to FIXED for better AR tracking performance.
Since for each frame we can get camera intrinsic parameter of focal length, why we need to use a fixed focus?
With Fixed Camera Focus ARCore can better calculate a parallax (no near or distant real-world objects must be out of focus), so your Camera Tracking will be reliable and accurate. At Tracking Stage, your gadget should be able to clearly distinguish all textures of surrounding objects and feature points – to build correct 3D scene.
Also, Scene Understanding stage requires fixed focus as well (to correctly detect planes, catch lighting intensity and direction, etc). That's what you expect from ARCore, don't you?
Fixed Focus also guarantees that your "in-focus" rendered 3D model will be placed in scene beside the real-world objects that are "in-focus" too. However, if we're using Depth API we can defocus real-world and virtual objects.
P.S.
In the future ARCore engineers may change the aforementioned behaviour of camera focus.

ARKit plane with real world object above it

Thanks in advance for reading my question. I am really new to ARKit and have followed several tutorials which showed me how to use plane detection and using different textures for the planes. The feature is really amazing but here is my question. Would it be possible for the player to place the plane all over the desired area first and then interact with the new ground? For example, could I use the plane detection to detect and put grass texture over an area and then drive a real RC car over it? Just like driving it on real grass.
I have tried out the plane detection on my iPhone 6s while what I found is when I tried to put anything from real world on the top of plane surface it just simply got covered by the plane. Could you please give me some clue if it is possible to make the plane just stay on the ground without covering the real world object?
I think that's sth what you are searching for:
ARKit hide objects behind walls
Or another way is i think to track the position of the real world object for example with apples turicreate or CoreML or both -> then don't draw your stuff on the affected position.
Tracking moving objects is not supported, that's actually what it would be needed to make a real object interact with the a virtual one.
Said that I would recommend you using 2D image recognition and "read" every camera frame to detect the object while moving in the camera's view space. Look for the AVCaptureVideoDataOutputSampleBufferDelegate protocol in Apple's developer site
Share your code and I could help with some ideas

Can ARCore track moving surfaces?

ARCore can track static surfaces according to its documentation, but doesn't mention anything about moving surfaces, so I'm wondering if ARCore can track flat surfaces (of course, with enough feature points) that can move around.
Yes, you definitely can track moving surfaces and moving objects in ARCore.
If you track static surface using ARCore – the resulted features are mainly suitable for so-called Camera Tracking. If you track moving object/surface – the resulted features are mostly suitable for Object Tracking.
You also can mask moving/not-moving parts of the image and, of course, inverse Six-Degrees-Of-Freedom (translate xyz and rotate xyz) camera transform.
Watch this video to find out how they succeeded.
Yes, ARCore tracks feature points, estimates surfaces, and also allows access to the image data from the camera, so custom computer vision algorithms can be written as well.
I guess it should be possible theoretically.
However, Ive tested it with some stuff in my HOUSE (running S8 and an app with unity and arcore)
and the problem is more or less that it refuses to even start tracking movable things like books and plates etc:
due to the feature points of the surrounding floor etc it always picks up on those first.
Edit: did some more testing and i Managed to get it to track a bed sheet, it does However not adjust to any movement. Meaning as of now the plane stays fixed allthough i saw some wobbling but i guess that Was because it tried to adjust the Positioning of the plane once it's original Feature points where moved.

Extrinsic Camera Calibration Using OpenCV's solvePnP Function

I'm currently working on an augmented reality application using a medical imaging program called 3DSlicer. My application runs as a module within the Slicer environment and is meant to provide the tools necessary to use an external tracking system to augment a camera feed displayed within Slicer.
Currently, everything is configured properly so that all that I have left to do is automate the calculation of the camera's extrinsic matrix, which I decided to do using OpenCV's solvePnP() function. Unfortunately this has been giving me some difficulty as I am not acquiring the correct results.
My tracking system is configured as follows:
The optical tracker is mounted in such a way that the entire scene can be viewed.
Tracked markers are rigidly attached to a pointer tool, the camera, and a model that we have acquired a virtual representation for.
The pointer tool's tip was registered using a pivot calibration. This means that any values recorded using the pointer indicate the position of the pointer's tip.
Both the model and the pointer have 3D virtual representations that augment a live video feed as seen below.
The pointer and camera (Referred to as C from hereon) markers each return a homogeneous transform that describes their position relative to the marker attached to the model (Referred to as M from hereon). The model's marker, being the origin, does not return any transformation.
I obtained two sets of points, one 2D and one 3D. The 2D points are the coordinates of a chessboard's corners in pixel coordinates while the 3D points are the corresponding world coordinates of those same corners relative to M. These were recorded using openCV's detectChessboardCorners() function for the 2 dimensional points and the pointer for the 3 dimensional. I then transformed the 3D points from M space to C space by multiplying them by C inverse. This was done as the solvePnP() function requires that 3D points be described relative to the world coordinate system of the camera, which in this case is C, not M.
Once all of this was done, I passed in the point sets into solvePnp(). The transformation I got was completely incorrect, though. I am honestly at a loss for what I did wrong. Adding to my confusion is the fact that OpenCV uses a different coordinate format from OpenGL, which is what 3DSlicer is based on. If anyone can provide some assistance in this matter I would be exceptionally grateful.
Also if anything is unclear, please don't hesitate to ask. This is a pretty big project so it was hard for me to distill everything to just the issue at hand. I'm wholly expecting that things might get a little confusing for anyone reading this.
Thank you!
UPDATE #1: It turns out I'm a giant idiot. I recorded colinear points only because I was too impatient to record the entire checkerboard. Of course this meant that there were nearly infinite solutions to the least squares regression as I only locked the solution to 2 dimensions! My values are much closer to my ground truth now, and in fact the rotational columns seem correct except that they're all completely out of order. I'm not sure what could cause that, but it seems that my rotation matrix was mirrored across the center column. In addition to that, my translation components are negative when they should be positive, although their magnitudes seem to be correct. So now I've basically got all the right values in all the wrong order.
Mirror/rotational ambiguity.
You basically need to reorient your coordinate frames by imposing the constraints that (1) the scene is in front of the camera and (2) the checkerboard axes are oriented as you expect them to be. This boils down to multiplying your calibrated transform for an appropriate ("hand-built") rotation and/or mirroring.
The basic problems is that the calibration target you are using - even when all the corners are seen, has at least a 180^ deg rotational ambiguity unless color information is used. If some corners are missed things can get even weirder.
You can often use prior info about the camera orientation w.r.t. the scene to resolve this kind of ambiguities, as I was suggesting above. However, in more dynamical situation, of if a further degree of automation is needed in situations in which the target may be only partially visible, you'd be much better off using a target in which each small chunk of corners can be individually identified. My favorite is Matsunaga and Kanatani's "2D barcode" one, which uses sequences of square lengths with unique crossratios. See the paper here.

How to position a car in image processing (computer vision)?

I would like to locate a car (front center point x,y) using a high resolution single camera. The camera setup is fixed at 1-2m high, and tilted around 25 degrees. The camera can provide images in where the front side of the car is visible. The intrinsic and extrinsic parameters are known.
So far, I tried to detect the headlights and number plates. Issues... Headlights are not detected as blobs all the time. The shape of the headlights are changing depending on the distance. Also, the number plate is not visible in the dark.
Is there a robust algorithm to detect a car? or to detect headlights? or detect number plate?How could I proceed?
Thanks in advance,
Are you detecting the same car everytime? If yes, then presumably the appearance remains consistent. Rather than detect and recognise blobs and shapes, you may be better off using scale and rotation invariant features combined with a machine learning algorithm. Look into the SIFT and SURF feature descriptors. For easy experimentation, use OpenCV's implementation of feature description and matching. Take a look at this example.
This is not an easy problem because of the change in the scale and point of view. Ideally, you would need a collection of training images with the car seen from different points of view to match later some of them to your input image. Then, you need local features (SIFT, SURF) or some classifier to decide on the match.
On the other hand, if you are tracking the same car all the time, check out the MeanShift algorithm. The problem is you need an initial position to carry on with the tracking.

Resources