Detecting a real world object using ARKit with iOS - ios

I am currently playing a bit with ARKit. My goal is to detect a shelf and draw stuff onto it.
I did already find the ARReferenceImage and that basically works for a very, very simple prototype, but the image needs to be quite complex it seems? Xcode always complains if I try to use something a lot simpler (like a QR-Code like image). With that marker I would know the position of an edge and then I'd know the physical size of my shelf and know how to place stuff into it. So that would be ok, but I think small and simple markers will not work, right?
But ideally I would not need a marker at all.
I know that I can detect e.g. planes, but I want to detect the shelf itself. But as my shelf is open, it's not really a plane. Are there other possibilities to find an object using ARKit?
I know that my question is very vague, but maybe somebody could point me in the right direction. Or tell me if that's even possible with ARKit or if I need other tools? Like Unity?

There are several different possibilities for positioning content in augmented reality. They are called content anchors, and they are all subclasses of the ARAnchor class.
Image anchor
Using an image anchor, you would stick your reference image on a pre-determined spot on the shelf and position your 3D content relative to it.
the image needs to be quite complex it seems? Xcode always complains if I try to use something a lot simpler (like a QR-Code like image)
That's correct. The image needs to have enough visual detail for ARKit to track it. Something like a simple black and white checkerboard pattern doesn't work very well. A complex image does.
Object anchor
Using object anchors, you scan the shape of a 3D object ahead of time and bundle this data file with your app. When a user uses the app, ARKit will try to recognise this object and if it does, you can position your 3D content relative to it. Apple has some sample code for this if you want to try it out quickly.
Manually creating an anchor
Another option would be to enable ARKit plane detection, and have the user tap a point on the horizontal shelf. Then you perform a raycast to get the 3D coordinate of this point.
You can create an ARAnchor object using this coordinate, and add it to the ARSession.
Then you can again position your content relative to the anchor.
You could also implement a drag gesture to let the user fine-tune the position along the shelf's plane.
Conclusion
Which one of these placement options is best for you depends on the use case of your app. I hope this answer was useful :)
References
There are a lot of informative WWDC videos about ARKit. You could start off by watching this one: https://developer.apple.com/videos/play/wwdc2018/610

It is absolutely possible. If you do this in swift or Unity depends entirely on what you are comfortable working in.
Arkit calls them https://developer.apple.com/documentation/arkit/arobjectanchor. In other implementations they are often called mesh or model targets.
This Youtube video shows what you want to do in swift.
But objects like a shelf might be hard to recognize since their content often changes.

Related

Placing objects below the ground in AR Quick Look on iOS

I am working on a project that will display objects below the ground using AR Quick Look. However, the AR mode seems to bring everything above the ground based on the bounding box of the objects in the scene.
I have tried using the USDZ directly and composing a simple scene in Reality Composer with the object or with a simple cube with the exact same result. AR preview mode in Reality Composer is showing the object below the ground or below an image anchor correctly. However, if I export the scene as a .reality file and open it in using AR Quick Look, it brings the object above the ground as well.
Is there a way to achieve showing an object below the detected horizontal plane or image (horizontal) using AR Quick Look?
This is still an issue a year later. I have submitted feedback to Apple. I suggest you do too. I have suggested adding a checkbox to keep Y axis persistent. My assumption is this behaves this way to prevent the object from colliding with the ground, but I don't think it's necessary. It's just a limitation right now.

How to align SCNScene to a physical table using ARKit?

I'm trying to find the best strategy to align a SCNScene to a physical table. Just like the ARKit app WWWFreeRivers.
Currently I'm just testing out to map a simple plane model, with the same dimensions as the table. If I draw out the plane that ARKit detects, I can see that the plane is not very accurate with the edges. They always go outside of the edges (image below).
So I can't really rely on that plane, to just place the model in the center of this. The model is not rotated correctly either (image below).
I had another idea to use the ARReferenceImage technique, to take a picture of the table top texture, and let ARKit find and match this "image" of the table. But even with wood grain texture, it wasn't enough data for ARKit to recognize it. And ARKit just fails if you have these errors. It doesn't even try to do a bad match.
How can I go about doing this?
Ideas I've had so far:
Take image of table and use ARImageReference feature to match it. This didn't work. Maybe if I add some more interesting feature points to the table, like some sort of QR codes in the corners.
Detect the plane, and then tap the four corners on the table to map out a square, and use this.
Do as the WWW app, just place the object randomly on the plane, and then let the user scale, move and rotate the model to give it correct placement.
Any more ideas? What do you think will be the best approach to this?
Two options I can think of you could use.
You could create an ARWorldMap (iOS12+ only) and use it instead of the ARImageReference, walk around the area while creating a map that subsequent ARKit Sessions will remember. You can experiment slightly as to how to fit your models within the four corners of the table (this is slightly tedious w/o much help from the SceneView editor). However, when you load the saved ARWorldMap and localized against it (just like the ARImageReference), your model should fit within the four corners of the table every time.
If you use something like Unity (and its ARKit plugin), it has much more powerful Editor tools (3D viewer/designer). There are some tools that can help you save the map just like ARWorldMap but then bring in details of the map into the editor so you can line things up right really easily. Placenote's Spatial Capture toolkit can help here. Placenote (iOS11+) creates its own "World Map" but it exposes the visual details in the Unity editor, making it easier to line things up and then localize against (Example). The map is also stored on a managed cloud from the get-go to make sharing across phones much easier.
P.S: Both these options require you to keep the environment generally static (not large lighting changes etc.), though this was a similar constraint to when using ARIMageReference.

Placing Virtual Object Behind the Real World Object

In ARKit for iOS. If you display a virtual item then it always comes before any real item. This means if I stand in front of the virtual item then I would still see the virtual item. How can I fix this scenario?
The bottle should be visible but it is cutting off.
You cannot achieve this with ARkit only. It offers no off the shelve solution for solving occlusion, which is a hard problem.
Ideally you'd know the depth of each pixel projected on the camera, and would use that to determine those that are in front and those that are behind. I would not try something with the feature points ARKit is exposing since 1) their position is innacurate 2) there's no way to know between two frames which feature point of frame A is which feature point in frame B. It's way to noisy data to do anything good.
You might be able to achieve something with third party options that'd process the captured image and understand depth or different depth levels in the scene, but I don't know any good solution. There's some SLAM technique that yields dense depth map like DTAM (https://www.kudan.eu/kudan-news/different-types-visual-slam-systems/) but that'd be redoing most of what arkit is doing. There might be other approaches that I'm not aware off. Apps like snap do this in their own way so it is possible!
So basically your question is to mapping the coordinate of the virtual item on real world coordinate system, in short, you want to see the virtual item blocked by the real item, and you can only see the virtual item once you pass the real item.
If so, you need to know the physical relations of each object in this environment, and then you need to know exactly where you are to decide if the virtual item is blocked.
It's not an intuitive way to fix this, however, it's the only way I can think of.
Cheers.
What you are trying to achieve is not easy.
You need to detect the parts of the real world that "should be visible" using some kind of image processing. Or maybe the ARKit feature points that have the depth information, then based on this you have to add "an invisible virtual object" that cuts the drawing of things behind it. This will represent your "real object" inside the "virtual world" so that the background (camera feed) remains visible in places where this invisible virtual object is present.

iOS Heavy image switching

I'm developing a app that will showcase products. One of the features of this app is that you will be able to "rotate" the product, using your finger/Pan-Gesture.
I was thinking in implementing this by taking photos of the product from different angles so when you "drag" the image, all I would have to do is switch the image according. If you drag a little, i switch only 1 image... if you drag a lot, i will switch them in cadence making it look like a movie... but i have a concerns and a probable solution:
Is this "performatic"? Since its a art/museum product showcase, the photos will be quite large in size/definition, and loading/switching when "dragged a lot" might be a problem because it would cause "flickering"... And the solution would be: instead of loading pic-by-pic i would put them all inside one massive sheet, and work through them as if they were a sprite...
Is that a good ideia? Or should I stick with the pic-by-pic rotation?
Edit 1: There`s a complicator: the user will be able to zoom in/out and to rotate the product in any axis (X, Y and Z)...
My personal opinion, I don't think this will work the way you hope or the performance and/or aesthetics will not be what you want.
1) Taking individuals shots that you then try to keyframe to based on touch events won't work well because you will have inevitable inconsistencies in 'framing' the shots such that the playback won't be smooth
2) The best way to do this, I suspect, will be to shoot it with video and shoot it with some sort of rig that allows you to keep the camera fixed while rotating the object
3) I'm pretty sure this is how most 'professional' grade product carousel type presentations work
4) Even then you will have more image frames than you need -- not sure whether you plan to embed the images files in app or download on demand -- but that is also a consideration in terms of how much downsampling you'll need to do to reduce frames/file size
Suggestion
Look at shooting these as video (somewhat like described above) and downsampling and removing excess frames using a video editor. Then you could use AVFoundation for playback and use your gestures to 'scrub' into the video frames. I worked on something like this for HTML playback at a large company and I can assure you it was done with video.
Alternatively, if video won't work for you. Your sprite sheet solution might work (consider using SpriteKit). But then keep in mind what I said about trying to keyframe one off camera shots together -- it just won't work well. Maybe a compromise would be to shoot static images but do so by fixing the camera and rotating the objects at very specific increments. That could work as well I suppose but you will need to be very careful about light and other atmospehrics. It doesn't take much variation at all to be detectable to the human eye causing the whole presentation to seem strange. Good luck.
A coder from my company did something like this before using 360 images of an object and it worked just great but it didn't have zoom. Maybe you could add zoom by adding a pinch gesture recognizer and placing the image view into a scroll view to zoom in on the static image.
This scenario sounds like what you really need is a simple 3D model loader library or write it in OpenGL yourself. But this pan and zoom behavior is really basic when you make that jump to 3D so it should be easy to find lots of examples.
All depends on your situation and time constraints :)

iOS augmented reality with compass and location

I'm trying to develop a mini "Around Me" like using camera, compass and location. I would like to display place's images on my screen.
For the moment I have my location and my orientation with compass. I would like to know how can I determine the position of the place I want to display.
Thanks for your help ;)
Once you have relative distance and bearing, which you can determine from two points in the same coordinate space using algorithms found on this page, figuring out where a known coordinate is with respect to a known viewpoint is basically a perspective projection, the math is outlined on this Wikipedia article. The rotation of the camera is given by the compass, and the tilt by the accelerometer (the position is of course, GPS).
I'm trying to find a better document - there are a couple of extra things to consider - like the camera parameters etc, but this is a good starting point.
If it's too involved (like if you're not comfortable with rotation matrices) we can break it right down to the simple trig.
The code in the iPhone ARKit project does this, and quite a bit more. While you may not be able to use their complete library, it is a great reference on the subject of augmented reality.
Check out 3DAR, it lets you add an AR view to a MKMapView app very easily. There's a video tutorial on this process, as well as some sample code, on the 3DAR site, www.3dar.us
You can create a location based AR app in Junaio. It's an AR browser. Free to use and deploy in (as long as it's not a custom app and in Junaio).

Resources