ARCore: How to get depth (distance from camera) at any arbitrary pixel? - arcore

I have what I thought was a very simple need, which is to get a 2d array of distances from the camera many times per second (like a LIDAR); for example, a 10x10 array of samples that are interpolated across the screen. I just want the distance to whatever the pixel is showing. I thought ARCore should be easily capable of this because it's correlating all the pixels with past frames so it should know where everything is.
I thought I was supposed to use hitTest() for this; I could call hitTest() 100 times per second. But hitTest() usually gives no results or inaccurate results. For example, it might detect a table, but not a wall or anything else. And hitTest() seems to be very slow and laggy, so I can't call it 100 times per second anyway.
Am I doing something wrong? Also, would Apple's ARKit be a better choice for my needs? Or do I need to resort to external hardware which is better for getting actual distance?

ARCore does not support accurate depth sensing on the types of devices it can run on at this time (mid-2019).
See a note from the ARCore team here: https://github.com/google-ar/arcore-android-sdk/issues/206

Related

Am I missing something with stereo calibration?

I am trying the stereo_calib example and it fails with garbage output, even though it is finding corners in my images...
My xml file and images are all here:
https://drive.google.com/open?id=12-5jBN7FK-LO6SLb4r3YYkrOnP7f_xmG
What am I doing wrong? I first tried printing a pattern on a sheet of paper, then thought OK, that must be too wavy or something, so I had it printed on foam board. But no dice.
(we chatted on a side channel, so this is to the benefit of the rest of the world)
tl;dr: hold the board very still or get a camera with global shutter.
Rolling shutter (see here and there), an attribute of most webcam sensors, many camcorder sensors, and some industrial image sensors, will distort objects that are moving. If you've moved the board even just a little during a frame capture (visible in files right19/right20), it will be captured with distortion. That will affect everything you do with the picture, starting with intrinsic calibration.
To give a sense of scale for the distortions: assuming a 30 FPS video stream, the worst case rolling shutter lag is 33 ms. A pedestrian travels 40-50 mm in that time. If your hands are moving slightly, you can maybe expect a tenth of that, which is still a lot in proportion to the square sizes most people use.
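To make that arithmetic explicit, here is a tiny Python sketch (the ~1.4 m/s walking speed is an assumed typical value; everything else follows from the 30 FPS figure above):

# Back-of-the-envelope check of the rolling-shutter numbers above
fps = 30.0
readout_s = 1.0 / fps                       # worst-case rolling-shutter lag ~ 33 ms

walking_speed_m_s = 1.4                     # assumed typical pedestrian speed
hand_speed_m_s = walking_speed_m_s / 10.0   # "hands moving slightly"

print(f"pedestrian: ~{walking_speed_m_s * readout_s * 1000:.0f} mm per frame readout")
print(f"slow hand:  ~{hand_speed_m_s * readout_s * 1000:.1f} mm per frame readout")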
Another source of trouble is printers. If you've printed your checkerboard pattern, make sure to measure the width and height of your squares; they might be slightly rectangular. It's also a good idea to make sure the pattern is quite flat, not bent.
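If you want to find out which frames (like right19/right20) are hurting the calibration, one option is to look at the per-view reprojection error. A minimal sketch using OpenCV's Python bindings, assuming a 9x6 inner-corner checkerboard and a list images of grayscale calibration frames (the pattern size, square size, and variable names here are placeholders, not taken from the question):

import cv2
import numpy as np

pattern_size = (9, 6)     # inner corners of the checkerboard (assumed)
square_size = 0.025       # measured square size in metres (assumed)

# Template of 3D object points for one view of the board
objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
objp *= square_size

obj_points, img_points = [], []
for img in images:        # images = list of grayscale calibration frames
    found, corners = cv2.findChessboardCorners(img, pattern_size)
    if found:
        corners = cv2.cornerSubPix(
            img, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

# calibrateCameraExtended also returns the RMS reprojection error per view,
# so frames blurred or distorted by motion stand out as outliers.
ret, K, dist, rvecs, tvecs, std_int, std_ext, per_view_err = \
    cv2.calibrateCameraExtended(obj_points, img_points, images[0].shape[::-1],
                                None, None)
for i, err in enumerate(per_view_err.ravel()):
    print(f"view {i}: reprojection error {err:.3f} px")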

Get distance from coreMotion data

I have an Arduino controlling an LED light strip, and an iPhone connected to the Arduino via Bluetooth, so the number of lights that are turned on corresponds to the phone's position along an x axis.
Is it possible to use the accelerometer to estimate the distance the phone has traveled? I'm currently polling the accelerometer at 0.01-second intervals, so in 0.5 seconds I'll have an array of 50 values. I believe each value represents the g-force at the instant it was measured, so 1.0 = 9.8 m/s². What would be the formula to take this array and the time interval and calculate the distance? Am I reinventing the wheel here? I feel like ARKit has to use some kind of position tracking similar to this. Is there anything in Core Motion that could accomplish this for me?
Obligatory apology for not knowing what I'm doing. Also, similar questions have been asked before, but they are more than two years old and the answer then was that it's possible but not accurate. I assume it could be more accurate now, because ARKit wouldn't work without doing something like this.
No, this isn't practical. The problem is drift: the accelerometer isn't accurate enough to "zero out" the velocity of the phone, so minor errors in your calculations almost immediately swamp your results, and you can't tell whether the phone is sitting still or moving at a constant speed.
Acceleration is the second derivative of position, so starting from acceleration you have to integrate twice to get position, which magnifies the errors.
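To see why the errors blow up, here is a small NumPy sketch (the 0.01 g bias is an assumed, rather optimistic figure for a phone accelerometer): a tiny constant bias, integrated twice, produces a position error that grows with the square of time.

import numpy as np

dt = 0.01                        # 100 Hz sampling, as in the question
t = np.arange(0.0, 10.0, dt)     # 10 seconds of "standing still"

bias = 0.01 * 9.81               # assumed 0.01 g constant bias
accel = np.full_like(t, bias)    # true acceleration is zero; we only measure the bias

velocity = np.cumsum(accel) * dt       # first integration
position = np.cumsum(velocity) * dt    # second integration

print(f"apparent drift after 1 s : {position[int(1.0/dt) - 1] * 100:.1f} cm")
print(f"apparent drift after 10 s: {position[-1] * 100:.1f} cm")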
To do this you could instead have two Bluetooth sensors (one at each end of the strip) and use triangulation to calculate position. I haven't done this calculation myself to know all the details, but it's the same idea as those Bluetooth tags you can put on a bunch of items to help you locate your keys, etc.
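Since the strip is one-dimensional, the two-sensor idea reduces to a simple formula rather than full triangulation. A sketch under the assumption that you can already turn each Bluetooth signal into an approximate distance (e.g. from RSSI, which is itself quite noisy):

def position_on_strip(d1: float, d2: float, strip_length: float) -> float:
    """Estimate the phone's position x along the strip (0 = the sensor-1 end),
    given its distances d1 and d2 to the sensors at each end.
    From d1^2 = x^2 + h^2 and d2^2 = (L - x)^2 + h^2, the common offset h
    cancels out, leaving x = (d1^2 - d2^2 + L^2) / (2 L).
    """
    return (d1**2 - d2**2 + strip_length**2) / (2.0 * strip_length)

# Example: 2 m strip, phone measured 0.7 m from one end and 1.5 m from the other
print(position_on_strip(0.7, 1.5, 2.0))   # ~0.56 m along the strip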

How to get movement size from 3-axis accelerometer data

I have done a lot of experiments using the accelerometer to detect the movement size (magnitude): just one value derived from the x, y, z accelerations. I am using an iPhone 4 with an accelerometer update interval of 1.0 / 50.0 (50 Hz), but I've also tried 100 Hz, 150 Hz, and 200 Hz.
Examples (plots): acceleration on the X, Y, and Z axes.
I assume (I hope correctly) that the accelerations are the small peaks on the graph, not the big steps. From my experiments, I think the big steps show the device position: if I change the position, the step changes too.
If my previous assumption is correct, I need to cut the peaks from the graph and sum them up. Here comes my question: how can I cut out those peaks without losing the information, i.e. the peak sizes?
I know that a high-pass filter does this kind of thing (passes the high peaks and blocks the noise, the small ones); I've read some papers about filters. But for me, the filter cuts out a lot of information from my "signal" (the accelerometer data).
I think there should be a better way of getting the information out of the data. I've tried a simple approach which looks nice, but it isn't correct.
I produced this data using my magnitude function:
for i = 2 : length(x)
    converted(i-1) = x(i-1) - x(i);
end
where x is my data and the converted array is the result.
The following line generated the image below, which looks nice:
xyz = magnitude(datay) + magnitude(dataz) + magnitude(datax)
However, the problem with that solution is that if I have continuous acceleration, the graph just shows the first point and then goes down. I know that I need a better filter somehow, but I am a bit confused. Could you give me some advice on how to do this properly?
Thanks for your time,
I really appreciate your help
Edit (answers to Zaph's questions):
What are you trying to accomplish?
I want to measure the movement when the iPhone is placed on a desk, chair, or bed. The accelerometer is so sensitive that if I put a pencil down on the desk, it shows up. I want to measure all movement that happens in a specific time.
What are the scale units?
I'm not scaling the data.
When you say "device position", what do you mean? An accelerometer provides movement (in iPhones with gyros).
I am using only the accelerometer. When I hold the device as in the picture below, I get values around -1 on the x coordinate and 0.0 on the y and z coordinates. This is what I mean by device position.
The measurements that are returned from the accelerometer are acceleration, not position.
I'm not sure what you mean by "big steps", but the peaks show a change of acceleration. The fact that the values are not 0 when holding the device still comes from gravity accelerating the device at 9.81 m/s² (the magnitude of the acceleration vector).
You are potentially trying to do something quite difficult, especially with the low-quality sensors embedded in phones: getting the actual coordinate acceleration of the phone.
What you can do is detect the time periods when the phone was moved or touched. You can first calculate the magnitude (norm) of the acceleration signal and then, with a moving window, check for areas where the sample standard deviation is smaller than some threshold (those are the stationary periods; everything else is movement). Determining how the phone moved is a more complicated issue. Of course, you can check the orientation in the stationary areas between movements.
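A minimal sketch of that moving-window idea in NumPy (the window length and threshold are assumed values you would have to tune for your device and sampling rate):

import numpy as np

def movement_mask(ax, ay, az, window=25, threshold=0.02):
    """Return a boolean array marking samples where the device was moving.

    ax, ay, az : raw accelerometer samples (in g), e.g. at 50 Hz
    window     : moving-window length in samples (25 = 0.5 s at 50 Hz)
    threshold  : std-dev threshold in g; below it the device counts as still
    """
    magnitude = np.sqrt(np.asarray(ax)**2 + np.asarray(ay)**2 + np.asarray(az)**2)
    moving = np.zeros(len(magnitude), dtype=bool)
    for i in range(len(magnitude) - window + 1):
        segment = magnitude[i:i + window]
        if segment.std(ddof=1) > threshold:   # sample standard deviation
            moving[i:i + window] = True
    return moving

# The "movement size" over a recording could then be summarised, for example,
# as the summed deviation of the magnitude from 1 g during the moving periods.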

Structure from Motion (SfM) in a tunnel-like structure?

I have a very specific application in which I would like to try structure from motion to get a 3D representation. So far, all the software/code samples I have found for structure from motion are of the form "a fixed object photographed from all angles to create the 3D model". This is not my case.
In my case, the camera is moving in the middle of a corridor and looking forward. Sometimes the camera can look in other directions (left, right, up, down). The camera never goes back or looks back; it always moves forward. Since the corridor is small, almost everything is visible (no hidden spots). The corridor can sometimes be very long.
I have tried this software and it doesn't work in my particular case (but it's fantastic with normal use). Can anybody suggest a library/software/tool/paper that targets my specific needs? Or have you ever needed to implement something like this? Any help is welcome!
Thanks!
What kind of corridors are you talking about and what kind of precision are you aiming for?
A priori, I don't see why your corridor would not be a fixed object photographed from different angles. The quality of your reconstruction might suffer if you only look forward and you can't get many different views of the scene, but standard methods should still work. Are you sure that the programs you used aren't failing because of your picture quality, arrangement or other reasons?
If you have to do the reconstruction yourself, I would start by
1) Calibrating your camera
2) Undistorting your images
3) Matching feature points in subsequent image pairs
4) Extracting a 3D point cloud for each image pair
You can then orient the point clouds with respect to one another, for example via ICP between two subsequent clouds. More sophisticated methods might not yield much difference if you don't have any closed loops in your dataset (as your camera is only moving forward).
OpenCV and the Point Cloud Library should be everything you need for these steps. Visualization might be more of a hassle, but the pretty pictures are what you pay for in commercial software after all.
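For steps 3 and 4 in particular, here is a minimal two-view sketch using OpenCV's Python bindings. It assumes K is the camera matrix from your calibration in step 1 and that img1/img2 are two consecutive undistorted grayscale frames; the feature detector and parameters are just one reasonable choice, not the only one:

import cv2
import numpy as np

# 3) Match feature points between two consecutive frames
orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Estimate the relative camera motion from the essential matrix
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

# 4) Triangulate a 3D point cloud for this image pair
P1 = K @ np.hstack((np.eye(3), np.zeros((3, 1))))
P2 = K @ np.hstack((R, t))
points_4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
points_3d = (points_4d[:3] / points_4d[3]).T   # homogeneous -> Euclidean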
Edit (2017/8): I haven't worked on this in the meantime, but I feel like this answer is missing some pieces. If I had to answer it today, I would definitely suggest looking into the keyword monocular SLAM, which has recently seen a lot of activity, not least because of drones with cameras. Notably, LSD-SLAM is open source and may not be as vulnerable to feature-deprived views, as it operates directly on the intensity. There even seem to be approaches combining inertial/odometry sensors with the image matching algorithms.
Good luck!
FvD is right in the sense that your corridor is a static object. Your scenario is the same as moving around an object and taking images from multiple views; your views are just not arranged to provide a 360-degree view of the object.
I see you mentioned in your previous comment that the data is coming from a video? In that case, the problem could very well be the camera calibration. A camera calibration tells the SfM algorithm about the internal parameters of the camera (focal length, principal point, lens distortion, etc.). In the absence of knowledge about these, the bundler in VSfM uses information from the EXIF data of the image. However, I don't think video frames store any EXIF information (not 100% sure). As a result, I think the entire algorithm is running with bad focal length information and cannot solve for the orientation.
Can you extract a few frames from the video and see if there is any EXIF information?
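As a quick way to check, a sketch using Pillow (the tag numbers are the standard EXIF IDs; frames dumped from a video will usually return nothing here):

from PIL import Image

EXIF_IFD = 0x8769        # pointer to the Exif sub-IFD
FOCAL_LENGTH = 0x920A    # FocalLength tag within that IFD

def focal_length_from_exif(path):
    """Return the focal length (in mm) stored in the image's EXIF data, or None."""
    exif = Image.open(path).getexif()
    value = exif.get_ifd(EXIF_IFD).get(FOCAL_LENGTH)
    return float(value) if value is not None else None

# Frames dumped from a video will typically return None here,
# which would explain VSfM guessing a bad focal length.
print(focal_length_from_exif("frame_0001.jpg"))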

Is it possible to use core motion for distance measurement [duplicate]

Possible Duplicate:
Getting displacement from accelerometer data with Core Motion
Android accelerometer accuracy (Inertial navigation)
I am trying to use Core Motion's user acceleration values and double-integrating them to derive the distance covered. I move my iPhone linearly along its Y axis, against a 30 cm long ruler, on the table. First, I let the device rest for 10 seconds and calculate my offsets along the three axes by averaging the respective user acceleration values.
The X, Y and Z offsets are subtracted from the acceleration values when I calculate the distance covered. After offset subtraction, these values are passed through a low-pass filter and a median filter, separately of course. The filters are linear, and the cut-off is set by the number of neighbouring values whose mean is taken in the low-pass filter, and whose median is taken in the median filter. I have experimented with values of this number from 1 to 100. In the end, these filtered values are double-integrated using the trapezoidal rule to get distances. But the distance calculated is nowhere close to 30 cm. The closest value I got was about -22 cm (I am wondering why I get negative values even though I move the device in the positive Y direction). I also came across this:
http://ajnaware.wordpress.com/2008/09/05/accelerating-iphones/
It's an old post about the same thing, which says that the accelerometer readings appeared to come in quanta of about 0.18 m/s² (i.e. about 0.018 g), resulting in a large cumulative error very quickly. Going by that, for this error not to matter, one would have to accelerate the device by almost 1.8 m/s², which is practically impossible for distance/length measurement purposes. For small movements, it does not look like there is any possibility of calculating distances using an optimal filter and a higher-order numerical integration method without an impractical velocity/acceleration constraint like that. Is it possible?
How about using my acceleration-vs-timestamp data to fit a polynomial that grows over time as I get more and more motion updates, and which approximately represents the acceleration-vs-time curve? Double integration of this polynomial would be a piece of cake. But for small distances, the polynomial will have a big error component. Using a predictable, known motion that my device will be subjected to, I would take a huge number of snapshots (calculated distance vs. actual known distance) to calculate an error polynomial in a similar way, and then subtract it from my first polynomial. Can this work?
Although this does not fit StackOverflow, because it's not a question but a discussion, I'll try to sum up my thoughts about it.
As already said, the accelerometer is very inaccurate, and you would need very good accuracy for this kind of task, especially if you are trying to measure such short distances. Plus, accelerometers differ from device to device: you will get different results for the same movements with a different device. Plus there is a very large random error.
My guess is that you can get rid of a huge part of the randomness/error by calibrating the device and making the "measurement move" several times, say 10. After that you have enough data to get an average that might come close to the real value.
Calibration is a key part here, you have to think of a clever way to calibrate, like letting the user move the device over different distances in different speeds.
But all this is just theory. I would really like to see your results, but I doubt you'll get it working well enough, even with the best possible filters/algorithms, since there is just too much noise.
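For reference, the pipeline described in the question (bias removal from the rest period, smoothing, double trapezoidal integration) looks roughly like the NumPy sketch below; the window size and function name are illustrative only, and, as said above, expect large errors even when every step is implemented correctly.

import numpy as np

def estimate_distance(accel, dt, rest_samples, window=10):
    """Naive distance estimate from one axis of user-acceleration samples.

    accel        : samples in m/s^2 along the axis of motion
    dt           : sampling interval in seconds
    rest_samples : number of initial samples recorded while the phone was at rest
    window       : length of the moving-average (low-pass) filter in samples
    """
    a = np.asarray(accel, dtype=float)

    # 1) Estimate the constant bias from the initial rest period and remove it
    a = a - a[:rest_samples].mean()

    # 2) Simple moving-average low-pass filter
    a = np.convolve(a, np.ones(window) / window, mode="same")

    # 3) Double integration with the trapezoidal rule
    v = np.concatenate(([0.0], np.cumsum((a[1:] + a[:-1]) * dt / 2.0)))
    x = np.concatenate(([0.0], np.cumsum((v[1:] + v[:-1]) * dt / 2.0)))
    return x[-1]   # displacement at the end of the recording

# Any residual bias or noise left after step 1 grows quadratically through step 3,
# which is why averaging many repeated runs, as suggested above, is about the best
# you can do with this approach.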
