If I am not wrong, ARKit does not support vertical surface detection. But according to https://developer.apple.com/news/ and https://developer.apple.com/news/?id=01242018b:
iOS 11 is the biggest AR platform in the world, allowing you to create unparalleled augmented reality experiences for hundreds of millions of iOS users. Now you can build even more immersive experiences by taking advantage of the latest features of ARKit, available in iOS 11.3 beta. With improved scene understanding, your app can see and place virtual objects on vertical surfaces, and more accurately map irregularly shaped surfaces.
Does this mean that ARKit 1.5 is able to detect vertical surfaces too?
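For context, the relevant switch in the iOS 11.3 SDK is the planeDetection option on ARWorldTrackingConfiguration. A minimal sketch, assuming a standard ARSCNView-based setup (the helper name is just for illustration):

    import ARKit

    /// Sketch only: enables both horizontal and vertical plane detection.
    /// The .vertical option is what iOS 11.3 / ARKit 1.5 adds.
    func runPlaneDetection(on sceneView: ARSCNView) {
        let configuration = ARWorldTrackingConfiguration()
        configuration.planeDetection = [.horizontal, .vertical]
        sceneView.session.run(configuration)
    }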
Related
There's a bit of a delay when detecting planes using ARCore. That's fine, but what do you do when you want to place an object on a plane as the user pans the phone?
With the delay, the object pops onto the screen only after the plane is detected, rather than appearing while the user pans, which isn't realistic.
Let's compare two leading AR SDKs
LiDAR scanner in iOS devices for ARKit 4.0
Since the official release of ARKit 3.5 there has been support for the brand-new Light Detection And Ranging (LiDAR) scanner, which considerably reduces the time required to detect vertical and/or horizontal planes (it operates at nanosecond speed). Apple has implemented this sensor in the rear camera module of the 2020 iPad Pro. The LiDAR scanner (which is basically a direct ToF sensor) gives us an almost instant polygonal mesh of the real-world environment in an AR app, which is suitable for the People/Object Occlusion feature, precise Z-depth object placement, and a complex collision shape for dynamics. The working distance of Apple's LiDAR scanner is up to 5 meters. The LiDAR scanner also helps you detect planes in poorly lit rooms with no feature points on the walls and floor.
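As a hedged Swift sketch (not a complete app), this is roughly how an app opts into the LiDAR-based scene mesh alongside plane detection; the capability check keeps it safe on non-LiDAR hardware, and the helper name is just for illustration:

    import ARKit

    /// Sketch: enables LiDAR scene reconstruction plus plane detection
    /// on devices that support it (ARKit 3.5+, e.g. the 2020 iPad Pro).
    func runLiDARSession(in sceneView: ARSCNView) {
        let configuration = ARWorldTrackingConfiguration()
        configuration.planeDetection = [.horizontal, .vertical]

        // Scene reconstruction (a polygonal mesh of the environment) is only
        // available on devices equipped with the LiDAR scanner.
        if ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh) {
            configuration.sceneReconstruction = .mesh
        }

        sceneView.session.run(configuration)
    }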
iToF cameras in Android Devices for ARCore 1.18
A 3D indirect Time-of-Flight (iToF) sensor is a sort of scannerless LiDAR. It also surveys the surrounding environment and accurately measures distance. Although LiDARs and iToFs are almost the same thing at their core, the scanner type is more accurate, since it uses multiple laser pulses rather than one large flash laser pulse. In the Android world, Huawei and Samsung, for instance, include scannerless 3D iToF sensors in their smartphones, while the Google Pixel 4 doesn't have an iToF camera. The working distance of an iToF sensor is up to 5 meters and more. Let's see what Google says about its brand-new Depth API:
Google's Depth API uses a depth-from-motion algorithm to create depth maps, which you can obtain using the acquireDepthImage() method. This algorithm takes multiple device images from different angles and compares them to estimate the distance to every pixel as the user moves their phone. If the device has an active depth sensor, such as a time-of-flight (iToF) sensor, that data is automatically included in the processed depth. This enhances the existing depth map and enables depth even when the camera is not moving. It also provides better depth on surfaces with few or no features, such as white walls, or in dynamic scenes with moving people or objects.
Recommendations
When you're using an AR app built on ARCore without iToF sensor support, you need to detect planes in a well-lit environment containing rich and unique wall and floor textures (avoid repetitive textures or patterns like polka dots). You may also use the Augmented Images feature to quickly get anchors with the help of an image detection algorithm.
Conclusion
Plane detection is a very fast stage when you're using LiDAR or iToF sensors. But on devices without LiDAR or iToF (i.e. when you're using ARKit 3.0 and lower, or ARCore 1.17 and lower), there will be some delay at the plane detection stage.
If you need more details about the LiDAR scanner, read my story on Medium.
I know that ARKit is able to detect and classify planes on A12+ processors. It does the job reasonably well inside the house, but what about outside? Is it able to detect windows and doors if I move around a house a little? I tried it myself and the result did not satisfy me: I moved around the building quite a lot, and ARKit still did not distinguish the wall from the window.
I used the sample app from here for my tests: https://developer.apple.com/documentation/arkit/tracking_and_visualizing_planes
Am I doing everything correctly? Maybe there is some third-party library that detects house parts better?
Thanks in advance!
When you test the sample app outside and try to use ARKit to detect the surfaces on the exterior of a house, it will not work. ARKit is built to map flat surfaces and their orientations (horizontal/vertical). This means ARKit can understand that a surface is flat and that it is either a wall or a floor. When you attempt to "map" the exterior of a house, ARKit will only detect the flat vertical surfaces as walls; it cannot distinguish between a wall and a window.
You will need to develop/source an AI model and run it against the camera data using CoreML to enable your app to distinguish between windows and walls on the exterior of a house.
ARKit Plane tracking documentation for reference: https://developer.apple.com/documentation/arkit/tracking_and_visualizing_planes
A couple of articles about ARKit with CoreML:
https://www.rightpoint.com/rplabs/dev/arkit-and-coreml
https://medium.com/s23nyc-tech/using-machine-learning-and-coreml-to-control-arkit-24241c894e3b
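As a rough sketch of what that CoreML pipeline can look like, the current ARKit camera frame can be pushed through a Vision request. HousePartsClassifier below is a hypothetical model name standing in for whatever model you train or source, and orientation handling is omitted for brevity:

    import ARKit
    import Vision
    import CoreML

    /// Sketch: classifies the current ARKit camera frame with a Core ML model.
    /// `HousePartsClassifier` is a hypothetical model you train/source yourself.
    func classifyCurrentFrame(of session: ARSession) {
        guard let frame = session.currentFrame,
              let coreMLModel = try? HousePartsClassifier(configuration: MLModelConfiguration()).model,
              let visionModel = try? VNCoreMLModel(for: coreMLModel) else { return }

        let request = VNCoreMLRequest(model: visionModel) { request, _ in
            guard let best = (request.results as? [VNClassificationObservation])?.first else { return }
            print("Detected: \(best.identifier), confidence: \(best.confidence)")
        }

        // ARKit delivers frames as CVPixelBuffers; Vision can consume them directly.
        let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage)
        try? handler.perform([request])
    }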
[Update]
Yes, you are correct: on A12+ devices Apple does allow for plane classification. I would assume the issue with exterior windows vs. interior ones is either the distance to the window (too far for the computer vision to classify properly) or that Apple has tuned it more for interior windows than exterior ones. The difference may seem trivial, but to a CV algorithm it's quite different.
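For completeness, a minimal sketch of reading that classification from the plane anchors ARKit delivers, via an ARSCNViewDelegate callback (classification requires an A12 or newer chip, hence the isClassificationSupported guard; the class name is just for illustration):

    import ARKit

    /// Sketch: inspecting ARKit's plane classification as anchors are refined.
    class PlaneClassifierDelegate: NSObject, ARSCNViewDelegate {
        func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
            guard let planeAnchor = anchor as? ARPlaneAnchor,
                  ARPlaneAnchor.isClassificationSupported else { return }

            switch planeAnchor.classification {
            case .wall:   print("wall detected")
            case .window: print("window detected")
            case .door:   print("door detected")
            default:      print("other/unclassified plane")
            }
        }
    }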
I am trying to find out whether the accuracy, plane detection, and world tracking of ARKit will be better on an iPhone 8 Plus or iPhone X compared to an iPhone 7.
I googled it and read through this webpage.
There is no indication of dual-camera use, no explanation of the camera specs, and nothing about whether the processor power or better cameras in the latest devices will make ARKit more accurate (read here).
I am working on an accuracy-related ARKit app and I'd like to know more about this topic.
ARKit doesn't use the dual camera, so there's no functional difference between an iPhone (pick a number) Plus or iPhone X and other devices.
Apple's marketing claims that the iPhone 8 / 8 Plus and iPhone X are factory calibrated for better / more precise AR, but offers no definition of baseline vs improved precision to measure by.
That's about all Apple's said or is publicly known.
Beyond that, it's probably safe to assume that even if there's no difference in the world tracking algorithms or the camera / motion sensor inputs to those algorithms, the increased CPU / GPU performance of the A11 either gives your app more overhead to spend its performance budget on spiffy visual effects or lets you do more with AR before running the battery down. So you can say "better" about newer vs older devices, in general, but not necessarily in a way that has anything to do with algorithmic accuracy.
There's a tiny bit of room for ARKit to be "better" on, say, iPhone X or 8 Plus vs iPhone 8, and especially on iPad vs iPhone — and again it's more about performance differences than functional differences. Among devices of the same processor generation (A10, A11, etc), those devices with larger batteries and larger physical enclosures are less thermally constrained — so the combination of ARKit plus your rendering engine and game/app code will get to spend more time pushing the silicon at full speed than it will on a smaller device.
VPS (Visual Positioning Service) impressed me a lot. I know about visual positioning methods based on AR markers, but isn't it hard to do visual positioning in an open environment (without known markers)? I guess they may use the sensors in the smartphone to get differences in world coordinates, which may be used in the calculation.
Does anyone know how Google does indoor positioning in an open environment? Thanks.
VPS is in closed beta right now so finding out the specifics would probably be a breach of non-disclosure agreements.
However, Google Tango's current development stack achieves absolute positioning in Euclidean space with three core software/hardware technologies:
Motion Tracking (achieved through a wide-angle "fisheye" monochrome camera in conjunction with the IMU, gyroscopes, magnetometers, and accelerometers within the device).
Depth Perception (achieved through a "time of flight" infrared emitter and receiver, which creates a dense point cloud of depth measurements).
Area Learning (achieved through the RGB camera in conjunction with the fisheye and IR sensors, which 'maps' areas in the point cloud and 'remembers' their location).
The main point here is that Tango isn't just a software stack; it is hardware dependent too. You can only develop Tango software on a Tango-enabled device such as the Lenovo Phab 2 Pro.
You could always sign up for the Tango VPS closed beta and find out more that way?
Apple's description of the iPhone 6 includes this text:
Improved face detection
The iSight camera now recognizes faces faster and more accurately — even those farther away or in a crowd — for better portraits and group shots. And it improves blink and smile detection and selection of faces in burst mode to automatically capture your best shots.
I've used iOS face detection before, both from Core Image (docs) and AVFoundation (docs). I see no mention of iPhone 6 improvements in either library's documentation, or elsewhere in Apple's literature.
My app is showing near-identical detection times for both libraries running on an iPhone 6.
Is there any documentation on this "improved face detection"? Specifically, is this an update to the existing libraries, or something new that should be included?
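For reference, the Core Image path being compared here is just a CIDetector configured for faces; a minimal sketch is below (the AVFoundation path instead uses AVCaptureMetadataOutput with face metadata objects; the helper name is illustrative):

    import CoreImage
    import UIKit

    /// Sketch: the Core Image face-detection path, useful for timing comparisons.
    func detectFaces(in image: UIImage) -> [CIFaceFeature] {
        guard let ciImage = CIImage(image: image) else { return [] }

        let detector = CIDetector(ofType: CIDetectorTypeFace,
                                  context: nil,
                                  options: [CIDetectorAccuracy: CIDetectorAccuracyHigh])

        // CIDetectorEyeBlink / CIDetectorSmile enable the blink and smile flags
        // mentioned in Apple's marketing copy.
        let features = detector?.features(in: ciImage,
                                          options: [CIDetectorEyeBlink: true,
                                                    CIDetectorSmile: true]) ?? []
        return features.compactMap { $0 as? CIFaceFeature }
    }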