Is there a way to add Scene Reconstruction using ARGeoTrackingConfiguration in Xcode?
It appears it is only available for ARWorldTrackingConfiguration.
At the moment it's impossible, and it's easy to explain why. The Scene Reconstruction feature is a computationally intensive task (it has to process LiDAR data and manage the stored polygons, tracked anchors and classifications). And that's not counting the fact that you typically need simultaneous plane detection, raycasting, rendering of PBR shaders, shadows and physics at 60 fps.
But Geo Tracking is a highly expensive task too: your device must process URLSession data, ML data, IMU data and GPS data in real time. I suppose we would need a considerably more powerful iOS device with a capacious battery to run Scene Reconstruction during Geo Tracking.
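For reference, here's a minimal sketch of how Scene Reconstruction is enabled today; the property exists on ARWorldTrackingConfiguration only, and ARGeoTrackingConfiguration exposes nothing equivalent (the `arView` parameter is just an assumed RealityKit view):

```swift
import ARKit
import RealityKit

// Sketch: Scene Reconstruction can only be switched on via ARWorldTrackingConfiguration.
// `arView` is assumed to be an existing RealityKit ARView in your app.
func runWorldTrackingWithMesh(on arView: ARView) {
    let config = ARWorldTrackingConfiguration()

    if ARWorldTrackingConfiguration.supportsSceneReconstruction(.meshWithClassification) {
        config.sceneReconstruction = .meshWithClassification   // LiDAR devices only
    }
    config.planeDetection = [.horizontal, .vertical]

    arView.session.run(config)
}

// ARGeoTrackingConfiguration, by contrast, has no sceneReconstruction property,
// so there is simply nothing to set.
```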
I load a model into the AR environment and add an ARAnchor to stabilize it. When I place the device on the desk and pick it up later, the model's position is unchanged at first, but it soon flies away – the ARKit coordinate system drifts and becomes unstable.
How can I avoid or deal with this situation?
The ARKit/RealityKit world tracking system is based on a combination of five sensors:
Rear RGB Camera
LiDAR Scanner
Gyroscope
Accelerometer
Magnetometer
The latter three are known as the Inertial Measurement Unit (IMU) and operate at 1000 Hz. But what your RGB camera (running at 60 fps) and LiDAR (also at 60 fps) see is very important too.
Hence, the stability of world tracking greatly depends on the camera image.
Here are some recommendations for high-quality tracking:
Track only well-lit environment (if you don't have LiDAR)
Track only static objects (not moving)
Don't track poor-textured surfaces like white walls (if you don't have LiDAR)
Don't track surfaces with repetitive texture pattern (like Polka Dots)
Don't track mirrors, chrome and glass objects (reflective and refractive)
Move your iPhone slowly when tracking
Don't shake iPhone when tracking
Track as much of the environment as possible
Track high-contrast objects in environment (if you don't have LiDAR)
If you follow these recommendations, the ARKit coordinate system will stay stable.
Also, look at the picture in this SO post – it shows a good example of an environment for tracking and a bad one.
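If you want to be notified when tracking quality drops (and why), a minimal sketch using ARKit's session delegate could look like this (you would assign the delegate to your ARSession elsewhere):

```swift
import ARKit

// Sketch: observe tracking-state changes to know when the coordinate system
// becomes unreliable (excessive motion, insufficient features, relocalizing, etc.).
class TrackingMonitor: NSObject, ARSessionDelegate {

    func session(_ session: ARSession, cameraDidChangeTrackingState camera: ARCamera) {
        switch camera.trackingState {
        case .normal:
            print("Tracking is stable")
        case .limited(let reason):
            print("Tracking is limited:", reason)    // e.g. .excessiveMotion, .insufficientFeatures
        case .notAvailable:
            print("Tracking is not available")
        }
    }
}
```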
What sensors does ARCore use: single camera, dual-camera, IMU, etc. in a compatible phone?
Also, is ARCore adaptive enough to keep working when a sensor is unavailable, by switching to a less accurate version of itself?
Updated: May 10, 2022.
About ARCore and ARKit sensors
Google's ARCore, like Apple's ARKit, uses a similar set of sensors to track the real-world environment. ARCore can use a single RGB camera along with the IMU, which is a combination of an accelerometer, a magnetometer and a gyroscope. Your phone runs world tracking at 60 fps, while the Inertial Measurement Unit operates at 1000 Hz. There is also one more sensor that ARCore can use – an iToF camera for scene reconstruction (Apple's counterpart is the LiDAR scanner). ARCore 1.25 supports a Raw Depth API and a Full Depth API.
Read what Google says about its COM method, built on Camera + IMU:
Concurrent Odometry and Mapping – An electronic device tracks its motion in an environment while building a three-dimensional visual representation of the environment that is used for fixing a drift in the tracked motion.
Here's Google's patent US15595617: System and method for concurrent odometry and mapping.
In 2014–2017 Google tended towards a Multicam + DepthCam config (the Tango project)
In 2018–2020 Google tended towards a SingleCam + IMU config
In 2021 Google returned to a Multicam + DepthCam config
We all know that the biggest problem for Android devices is calibration. iOS devices don't have this issue (because Apple controls its own hardware and software). Low-quality calibration leads to errors in 3D tracking, so all your virtual 3D objects may "float" in a poorly tracked scene. If you use a phone without an iToF sensor, there's no miraculous button to fix bad tracking (and you can't switch to a less accurate version of tracking). The only solution in such a situation is to re-track your scene from scratch. However, tracking quality is much higher when your device is equipped with a ToF camera.
Here are five main rules for good tracking results (if you have no ToF camera):
Track your scene not too fast, not too slow
Track appropriate surfaces and objects
Use well lit environment when tracking
Don't track reflective or refractive objects
Horizontal planes are more reliable than vertical ones
SingleCam config vs MultiCam config
One of the biggest problems of ARCore (and of ARKit, too) is energy impact. We understand that the higher the frame rate, the better the tracking results. But the energy impact at 30 fps is HIGH, and at 60 fps it's VERY HIGH. Such an energy impact will quickly drain your smartphone's battery (due to the enormous burden on the CPU/GPU). So just imagine using two cameras for ARCore: your phone must process two image sequences at 60 fps in parallel, process and store feature points and AR anchors, and at the same time render animated 3D graphics with hi-res textures at 60 fps. That's too much for your CPU/GPU. In such a case, the battery will be dead in 30 minutes and as hot as a boiler. Users don't like that, because it makes for a poor AR experience.
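On the ARKit side of the same trade-off, one knob you do have is the capture format. A minimal sketch, assuming the device offers a lower-fps video format at all:

```swift
import ARKit

// Sketch: trade some tracking smoothness for battery life by picking the
// supported video format with the lowest frame rate.
let config = ARWorldTrackingConfiguration()

if let lowPowerFormat = ARWorldTrackingConfiguration.supportedVideoFormats
    .min(by: { $0.framesPerSecond < $1.framesPerSecond }) {
    config.videoFormat = lowPowerFormat
    print("Running at \(lowPowerFormat.framesPerSecond) fps,",
          "resolution \(lowPowerFormat.imageResolution)")
}
```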
What is the purpose of the "bake" option in the SceneKit editor? Does it have an impact on performance?
Type offers two options: Ambient Occlusion and Light Map
Destination offers two options: Texture and Vertex
For me, it crashes Xcode. It's supposed to render lighting (specifically shadows) into the textures on objects so you don't need static lights.
This should, theoretically, mean that all you need in your scene are the lights used to create dynamic lighting on objects that move, and you can save all the calculations required to fill the scene with static lights on static geometry.
In terms of performance, yes, baking in the lighting can create a HUGE jump in performance, because it saves you all the complex calculations that produce ambient light, occlusion, direct shadows and soft shadows.
If you're using ambient occlusion and soft shadows in real-time you'll be seeing VERY low frame rates.
And the quality possible with baking is far beyond what you can achieve with a supercomputer in real time, particularly in terms of global illumination.
What's odd is that SceneKit has a bake button. It has never worked for me, always crashing Xcode. But the thing is... to get the most from baking, you need to be a 3D artist, in which case you'll be much more inclined to do the baking in a 3D design app.
And 3D design apps have lighting solutions that are orders of magnitude better than the best SceneKit lighting possible. I can't imagine that there's really a need for baking in SceneKit. It's a strange thing for the development team to have spent time on, as it simply could never come close to the quality afforded by even the cheapest 3D design app.
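If you do bake in an external 3D app, applying the result in SceneKit is straightforward. A minimal sketch – the texture file names and the second UV channel are assumptions about how the asset was exported:

```swift
import UIKit
import SceneKit

// Sketch: apply an externally baked light map and AO map to a SceneKit material.
// "albedo.png", "lightmap.png", "ao.png" and UV channel 1 are assumed asset details.
let material = SCNMaterial()
material.diffuse.contents = UIImage(named: "albedo.png")

// Baked light map, multiplied over the diffuse color
material.multiply.contents = UIImage(named: "lightmap.png")
material.multiply.mappingChannel = 1          // lightmaps usually live on a dedicated UV set

// Baked ambient occlusion map
material.ambientOcclusion.contents = UIImage(named: "ao.png")
material.ambientOcclusion.mappingChannel = 1
```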
What I remember from college days:
Baking is a process used in 3D rendering and texturing. There are two kinds of baking: texture baking and physics baking.
Texture baking:
You calculate some data once and save it to a texture, then use that texture on your material. This reduces rendering time: without baking, everything is recalculated every single frame, and if you have animations that is a lot of wasted time.
Physics baking:
You can pre-calculate physics simulations in exactly the same way and reuse that data, for example for a rigid body.
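A minimal sketch of the physics-baking idea in SceneKit terms – the `bakedPositions` array is assumed to contain positions recorded from an offline rigid-body simulation, and they are replayed as a keyframe animation instead of running the physics engine at runtime:

```swift
import SceneKit

// Sketch: replay a pre-computed (baked) rigid-body trajectory as a keyframe
// animation, so no physics is simulated at runtime.
func replayBakedSimulation(on node: SCNNode,
                           bakedPositions: [SCNVector3],
                           frameDuration: TimeInterval) {
    let animation = CAKeyframeAnimation(keyPath: "position")
    animation.values = bakedPositions.map { NSValue(scnVector3: $0) }
    animation.duration = frameDuration * Double(bakedPositions.count)
    animation.calculationMode = .linear
    node.addAnimation(animation, forKey: "bakedPhysics")
}
```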
I'm trying to develop an algorithm for real-time tracking of moving objects with a single moving camera, as a project in OpenCV (C++).
My basic objectives are:
Detect motion in an (initially) static frame
Track that moving object (camera to follow that object)
Here is what I have tried already
Salient motion detection using temporal differencing and optical flow (does not compensate for a moving camera)
KLT-based feature tracking, but I was not able to segment the moving object's features (they got mixed with other trackable features in the image)
Mean-shift-based tracking (requires initialization and is a bit computationally expensive)
I'm now trying to look into the following methods:
Histograms of Oriented Gradients (HOG).
Algorithms that estimate camera motion parameters.
Any advice on which direction I should proceed in to achieve my objective?
Search for 'zdenek kalal predator' on Google, watch the videos and read the papers that come up. I think it will give you a lot of insight.
How can we detect rapid motion and objects simultaneously? Let me give an example.
Suppose there is a soccer match video, and I want to detect the position of each and every player with maximum accuracy. I was thinking about human detection, but in a soccer match video we could just as well treat the players as generic objects. Maybe we can do this with blob detection, but there are several problems with blobs, like:
1) I want to separate each and every player, so if players collide, blob detection will not help – it will be a problem to identify the players separately.
2) There will also be the problem of the stadium lights.
So is there any particular algorithm, method or library to do this?
I've seen some research papers but wasn't satisfied, so please suggest anything related to this – an article, algorithm, library, method, research paper, etc. – and please share your views on it.
For fast and reliable human detection, Dalal and Triggs' Histogram of Oriented Gradients (HOG) detector is generally accepted as very good. Have you tried playing with that?
Since you mentioned rapid motion changes, are you worried about fast camera motion or fast player/ball motion?
You can do 2D or 3D video stabilization to fix camera motion (try the excellent Deshaker plugin for VirtualDub).
For fast player motion, background subtraction or other blob detection will definitely help. You can use that to get a rough kinematic estimate, which in turn gives you an estimate of your blur kernel. This can then be used to deblur the image chip containing the player.
You can do additional processing to establish identity, based on OCRing jersey numbers, etc.
You mentioned concern about the stadium lights. Is the main issue that they cast shadows? The HOG detector can deal with that, and blob detection to get the blur kernel should still work fine with shadows.
If you have control over the camera, you may want to reduce exposure times to reduce blur. Denoising techniques can be used to reduce the CCD noise that occurs in extremely low light, and dense optical flow approaches can align the frames and boost the signal back up to something reasonable by adding the denoised frames.