Track user translation movement on iOS using sensors for a VR game? - opencv

I'm starting to experiment with VR game development on iOS. I learned a lot from the Google Cardboard SDK. It can track the user's head orientation, but it cannot track the user's translation. This limitation means the user can only look at the virtual environment from a fixed location (I know I could add auto-walk to the game, but it's just not the same).
I've been searching around the internet; some say translation tracking simply can't be done with sensors, but it seems that by combining them with the magnetometer you can track the user's movement path, like this example.
I also found a different method called SLAM, which uses the camera and OpenCV to do feature tracking, then uses the feature point information to calculate translation. Here are some examples from 13th Lab. Google also has Project Tango, which is more advanced, but it requires hardware support.
I'm quite new to this topic, so I'm wondering: if I want to track not only the head orientation but also the head (or body) translation in my game, which method should I choose? SLAM seems pretty good, but it's also pretty difficult, and I think it will have a big impact on the CPU.
If you are familiar with this topic, please give me some advice. Thanks in advance!

If high accuracy is not important, you can try using the accelerometer to detect walking (basically a pedometer) and multiply the step count by an average human step length. Direction can be determined by the compass / magnetometer (a rough sketch follows below).
High-accuracy tracking would likely require complex algorithms such as SLAM, though many such algorithms are already implemented in SDKs such as Vuforia or Kudan.
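Below is a minimal sketch of that pedometer-plus-heading dead reckoning on iOS, assuming Core Motion's step counting and Core Location's heading updates. The 0.7 m average step length and the class and property names are illustrative assumptions, not values from any SDK.

    import Foundation
    import CoreMotion
    import CoreLocation

    // Very rough dead reckoning: step count * assumed average step length, applied
    // along the current compass heading. Only illustrates the idea above; accuracy is low.
    final class DeadReckoner: NSObject, CLLocationManagerDelegate {
        private let pedometer = CMPedometer()
        private let locationManager = CLLocationManager()
        private let stepLength = 0.7                  // metres per step (assumed average)
        private var headingDegrees = 0.0
        private var lastStepCount = 0
        private(set) var position = (x: 0.0, y: 0.0)  // metres east/north of the start

        func start() {
            locationManager.delegate = self
            locationManager.startUpdatingHeading()

            pedometer.startUpdates(from: Date()) { [weak self] data, _ in
                guard let self = self, let steps = data?.numberOfSteps.intValue else { return }
                let newSteps = steps - self.lastStepCount
                self.lastStepCount = steps

                // Advance the estimated position along the current heading.
                let distance = Double(newSteps) * self.stepLength
                let radians = self.headingDegrees * .pi / 180
                self.position.x += distance * sin(radians)   // east component
                self.position.y += distance * cos(radians)   // north component
            }
        }

        func locationManager(_ manager: CLLocationManager, didUpdateHeading newHeading: CLHeading) {
            headingDegrees = newHeading.magneticHeading
        }
    }

Drift accumulates quickly with this approach, which is why the SLAM-based options are worth the extra effort when positional accuracy matters.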

Hi, I disagree with you, Zhiquiang Li.
Look at this video made with Kudan; the tracking is quite stable, and moreover my smartphone is quite an old phone.
https://youtu.be/_7zctFw-O0Y

Related

How does the Google Measure app work on Android?

I can see that it can measure horizontal and vertical distances with +/-5% accuracy. I have a use case in which I am trying to formulate an algorithm to detect the distance between two points in an image or video. Any pointers to how it could be working would be very useful to me.
I don't think the source is available for the Android Measure app, but it is ARCore-based, and I would expect it uses a combination of triangulation and knowledge it reads from the 'scene' (to use the Google ARCore term) it is viewing.
Like a human estimating the distance to a point by basic triangulation between two eyes and the point being looked at, a measurement app is able to look at multiple views of the scene and to measure, using its sensors, how far the device has moved between the different views. Even a small movement allows the same triangulation techniques to be used (a simplified form is sketched below).
The reason for mentioning all this is to highlight that you do not have the same tools or information available to you if you are analysing image or video files without any position or sensor data. Hence, the Google measure app may not be the best template for you to look to for your particular problem.
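For intuition, here is the simplified triangulation relation behind that idea (a rectified pinhole-camera approximation for illustration, not the Measure app's actual algorithm):

    Z = \frac{f \cdot B}{d}

where Z is the distance to the observed point, f is the focal length in pixels, B is the baseline (how far the device moved between the two views, known from its motion sensors), and d is the disparity (how many pixels the same feature shifted between the two views). Once Z is recovered for two points, the distance between them follows from their back-projected 3D coordinates. Without a sensor-provided baseline B, the scale is unknown, which is exactly the limitation you face when analysing an image or video file alone.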

Can ARCore track moving surfaces?

ARCore can track static surfaces according to its documentation, but doesn't mention anything about moving surfaces, so I'm wondering if ARCore can track flat surfaces (of course, with enough feature points) that can move around.
Yes, you definitely can track moving surfaces and moving objects in ARCore.
If you track a static surface with ARCore, the resulting features are mainly suitable for so-called camera tracking. If you track a moving object/surface, the resulting features are mostly suitable for object tracking.
You can also mask moving/non-moving parts of the image and, of course, invert the six-degrees-of-freedom (XYZ translation and XYZ rotation) camera transform.
Watch this video to find out how they succeeded.
Yes, ARCore tracks feature points, estimates surfaces, and also allows access to the image data from the camera, so custom computer vision algorithms can be written as well.
I guess it should be possible theoretically.
However, I've tested it with some objects around my house (running an S8 and an app built with Unity and ARCore),
and the problem is more or less that it refuses to even start tracking movable things like books, plates, etc.:
due to the feature points of the surrounding floor and so on, it always picks up on those first.
Edit: I did some more testing and managed to get it to track a bed sheet. However, it does not adjust to any movement, meaning that as of now the plane stays fixed. I did see some wobbling, but I guess that was because it tried to adjust the positioning of the plane once its original feature points were moved.

ARKit and Unity - How can I detect the act of hitting the AR object by a real world object from the camera?

Imagine someone in real life waving their hand and hitting the 3D object in AR: how would I detect that? I basically want to know when something crosses over the AR object so I can know that something "hit" it and react.
Another example would be placing a virtual bottle on the table, then waving your hand in the air where the bottle is so that it gets knocked over.
Can this be done? If so, how? I would prefer a Unity solution, but if this can only be done via Xcode and ARKit natively, I would be open to that as well.
ARKit does solve a ton of issues with AR and makes them a breeze to work with. Your issue just isn't one of them.
As @Draco18s notes (and emphasizes well with the xkcd link 👍), you've perhaps unwittingly stepped into the domain of hairy computer vision problems. You do have some building blocks to work with, though: ARKit provides pixel buffers for each video frame, and the projection matrix you need to work out which portion of the 2D image is overlaid by your virtual water bottle.
Deciding when to knock over the water bottle is then a problem of analyzing frame-to-frame differences over time in that region of the image. (And tracking that region's movement relative to the whole camera image, since the user probably isn't holding the device perfectly still.) The amount of analysis required varies depending on the sophistication of the effect you want... a simple pixel diff might work (for some value of "work"), or there might be existing machine learning models that you could put together with Vision and Core ML...
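Here is a minimal sketch of those building blocks in Swift, hypothetically projecting the bottle's anchor into the captured image and watching that region for a sudden luminance change. The bottleAnchor property, the 40-pixel region radius, and the luminance threshold are illustrative assumptions, and real code would also need to handle camera movement and lighting changes as noted above.

    import ARKit

    // Watches the camera pixels around a virtual object and reports a "hit" when the
    // average luminance in that region changes abruptly between frames. A sketch only:
    // bottleAnchor, regionRadius and hitThreshold are assumptions, not ARKit values.
    final class HitWatcher: NSObject, ARSessionDelegate {
        var bottleAnchor: ARAnchor?          // the anchor your virtual bottle sits on (assumed to exist)
        private var previousLuma: Double?
        private let regionRadius = 40        // half-size of the watched box, in pixels (arbitrary)
        private let hitThreshold = 18.0      // mean-luma jump treated as a "hit" (arbitrary)

        func session(_ session: ARSession, didUpdate frame: ARFrame) {
            guard let anchor = bottleAnchor else { return }

            // 1. Project the object's world position into the captured image.
            let p = anchor.transform.columns.3
            let imageSize = CGSize(width: CVPixelBufferGetWidth(frame.capturedImage),
                                   height: CVPixelBufferGetHeight(frame.capturedImage))
            let point = frame.camera.projectPoint(simd_float3(p.x, p.y, p.z),
                                                  orientation: .landscapeRight,
                                                  viewportSize: imageSize)

            // 2. Average the luminance (Y plane) in a small box around that point.
            guard let luma = meanLuma(in: frame.capturedImage, around: point) else { return }

            // 3. A sudden frame-to-frame change suggests something crossed the region.
            if let previous = previousLuma, abs(luma - previous) > hitThreshold {
                print("Something crossed the bottle region, treat it as a hit")
            }
            previousLuma = luma
        }

        private func meanLuma(in buffer: CVPixelBuffer, around point: CGPoint) -> Double? {
            CVPixelBufferLockBaseAddress(buffer, .readOnly)
            defer { CVPixelBufferUnlockBaseAddress(buffer, .readOnly) }
            guard let base = CVPixelBufferGetBaseAddressOfPlane(buffer, 0) else { return nil }

            let width = CVPixelBufferGetWidthOfPlane(buffer, 0)
            let height = CVPixelBufferGetHeightOfPlane(buffer, 0)
            let rowStride = CVPixelBufferGetBytesPerRowOfPlane(buffer, 0)
            let bytes = base.assumingMemoryBound(to: UInt8.self)

            let minX = max(0, Int(point.x) - regionRadius), maxX = min(width - 1, Int(point.x) + regionRadius)
            let minY = max(0, Int(point.y) - regionRadius), maxY = min(height - 1, Int(point.y) + regionRadius)
            guard minX < maxX, minY < maxY else { return nil }

            var sum = 0
            for y in minY...maxY {
                for x in minX...maxX { sum += Int(bytes[y * rowStride + x]) }
            }
            return Double(sum) / Double((maxX - minX + 1) * (maxY - minY + 1))
        }
    }

The same idea should carry over to Unity, which exposes the camera image to C# scripts through ARFoundation.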
You should take a look at ManoMotion: https://www.manomotion.com/
They're working on this issue and are supposed to release a solution in the form of a library soon.

iOS Compass vs independent developer versions

I can't seem to manage to make my compass application function anywhere near as smoothly as Apple's, and I haven't seen any compass applications from independent developers that don't lag and sometimes jump erratically like mine does.
Is there any open source that demonstrates how to achieve such fluidity? Accuracy isn't quite so important, but smooth animation is.
More general advice on handling direction:
Easy: Apply a low-pass filter to discard erratic readings (a minimal sketch follows below).
Tricky: The compass updates more slowly than the gyroscope yaw, so you can measure the gyroscope drift between compass updates to improve the response. Sample code.
Trickier: The proper way to fuse the gyroscope and the compass would be to use a Kalman filter, but it isn't trivial. There is a talk about it at http://talkminer.com/viewtalk.jsp?videoid=C7JQ7Rpwn2k
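Here is a minimal sketch of the "easy" option, an exponential low-pass filter over the heading with wrap-around handling at 0/360 degrees. The smoothing factor of 0.15 is an arbitrary starting point to tune against your animation.

    import CoreLocation

    // Exponential low-pass filter for compass headings. Smaller alpha values give a
    // smoother but laggier needle; 0.15 is just an arbitrary starting point.
    final class HeadingSmoother {
        private var smoothed: Double?
        private let alpha = 0.15

        // Feed raw headings in degrees (0..<360) and get a smoothed heading back.
        func update(with rawHeading: CLLocationDirection) -> Double {
            guard let current = smoothed else {
                smoothed = rawHeading
                return rawHeading
            }
            // Use the shortest angular difference so 359° -> 1° doesn't swing the long way round.
            var delta = rawHeading - current
            if delta > 180 { delta -= 360 }
            if delta < -180 { delta += 360 }

            var next = current + alpha * delta
            if next < 0 { next += 360 }
            if next >= 360 { next -= 360 }
            smoothed = next
            return next
        }
    }

You would typically call update(with:) from locationManager(_:didUpdateHeading:) with newHeading.magneticHeading (or trueHeading) and drive the needle animation from the returned value.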

How to use the VGA camera as an optical sensor?

I am designing an information kiosk which incorporates a mobile phone hidden inside the kiosk.
I wonder whether it would be possible to use the VGA camera of the phone as a sensor to detect when somebody is standing in front of the kiosk.
Which software components (e.g. Java, APIs, Bluetooth stack, etc.) would be required for the code to use the VGA camera for movement detection?
The obvious choice is to use face detection, but you would have to calibrate it to ensure that the detected face is close enough to the kiosk, perhaps using the relative size of the face in the picture. This could be done with the widely used OpenCV library. But since this kiosk would be deployed in places where you have little control over the lighting, there is a good chance of false positives and negatives. You may also want to consider a proximity sensor in combination with face detection.
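If the phone hidden in the kiosk happened to be an iOS device, the face-size heuristic could be sketched with Apple's Vision framework instead of OpenCV (a swapped-in alternative, purely for illustration). The 0.2 minimum face height is an arbitrary threshold that would need calibration against the actual camera placement.

    import Vision
    import CoreVideo

    // Returns true when a detected face fills enough of the frame to suggest someone is
    // standing right in front of the kiosk. Vision is used here as a stand-in for the
    // OpenCV approach above; the default threshold of 0.2 is an uncalibrated guess.
    func personIsInFront(of pixelBuffer: CVPixelBuffer, minFaceHeight: CGFloat = 0.2) -> Bool {
        let request = VNDetectFaceRectanglesRequest()
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
        do {
            try handler.perform([request])
        } catch {
            return false
        }
        // boundingBox is normalised (0...1); a tall box means the face is close to the camera.
        guard let faces = request.results as? [VNFaceObservation] else { return false }
        return faces.contains { $0.boundingBox.height >= minFaceHeight }
    }

You would feed it frames from whatever capture pipeline the kiosk uses, and combining it with the proximity-sensor idea above would help cut down false positives.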
Depending on what platform the information kiosk is using, the options will vary... but assuming there is Linux somewhere underneath, you should take a look at the OpenCV library. And in case it is of any use, here's a link to my fun experiment to get a 'nod-controlled interface' for reading long web pages.
And speaking of false positives, or even worse, false negatives: in case of bad lighting or an unusual angle, the chances are pretty high. So you'd need to complement this with some fallback mechanism, like an on-screen 'press here to start' button that is shown by default, and then use an inactivity timeout alongside the face detection to avoid having just one input vector.
Another idea (depending on the light conditions) might be to measure the overall amount of light in the picture: natural light should elicit only slow changes, while a person walking up close to the kiosk would cause a rapid lighting change.
In J2ME (Java for mobile phones), you can use the MMAPI (Mobile Media API) to capture images from the camera.
Most phones support this.
@Andrew's suggestion of OpenCV is good. There are a lot of motion detection projects. But I would suggest adding a cheap CMOS camera rather than using the mobile phone's camera.

Resources