I am currently working on an iOS app that takes a picture programmatically through a custom button, using AVFoundation classes such as AVCaptureDevice.
The new requirement is that the camera should automatically take a picture when the camera session detects something specific. For example, if the camera is open, and I line up an apple to fill a certain circle part of the capture screen, it should take the picture automatically. We can see this auto capture feature in some banking apps when you submit a mobile check deposit.
Does anyone know of existing libraries (open-source or proprietary) that can analyze images in real time while a user is taking a picture?
The first thing you are going to need to do is decide how you want to detect the apple. You can do this using shape detection, image recognition, or various other methods. This is important because you need to know the approach you want to take before you can identify the best way to implement it.
Once you know how you are going to identify the apple, the easiest way to do real-time image processing like this would be to use an existing augmented reality SDK. For example:
http://www.wikitude.com/products/wikitude-sdk/
http://artoolkit.org/
https://developer.vuforia.com/
If you are feeling really adventurous you could roll your own using AForge or a similar library. I have taken this approach in the past for basic shape detection projects.
Edit
The reason I suggest using an existing AR SDK is because generally they provide a lot of the glue between the camera feed and their API for you and it takes a lot of leg work out of the equation. Even though you won't be using any of the actual "augmentation" part of their SDKs, you can still take advantage of the detection part.
No matter what approach you take, you can think about it in the simplest terms: look at a picture and figure out whether the item you want is in it. How do you decide? In most cases you look for a specific shape or pattern.
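Whichever detector you pick, the "take the picture automatically" part can stay very small. Below is a minimal sketch of that decision logic, written in Python for brevity rather than as iOS code. It assumes your detector gives you one bounding box per frame (or nothing), and that the guide circle is known in the same pixel coordinates. The names Box, fills_circle and AutoCaptureTrigger are hypothetical and do not come from any SDK. The idea is to fire only after the object has filled the circle for several consecutive frames, so a single noisy detection does not trigger a capture.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Box:
    """Axis-aligned bounding box from a detector (hypothetical type)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def fills_circle(box: Box, cx: float, cy: float, radius: float,
                 tolerance: float = 0.2) -> bool:
    """True if the box is centred on the circle and roughly as large as it."""
    box_cx = box.x + box.w / 2
    box_cy = box.y + box.h / 2
    centre_offset = math.hypot(box_cx - cx, box_cy - cy)
    size_ratio = max(box.w, box.h) / (2 * radius)
    return (centre_offset <= tolerance * radius
            and abs(size_ratio - 1.0) <= tolerance)


class AutoCaptureTrigger:
    """Fires once the object has filled the circle for N frames in a row."""

    def __init__(self, cx: float, cy: float, radius: float,
                 frames_required: int = 5) -> None:
        self.cx, self.cy, self.radius = cx, cy, radius
        self.frames_required = frames_required
        self.streak = 0

    def update(self, box: Optional[Box]) -> bool:
        """Feed one detection per frame; returns True when it is time to capture."""
        if box is not None and fills_circle(box, self.cx, self.cy, self.radius):
            self.streak += 1
        else:
            self.streak = 0  # any miss resets the run
        if self.streak >= self.frames_required:
            self.streak = 0  # avoid firing again on the very next frame
            return True
        return False
```

In an app you would call update once per camera frame and invoke your existing capture code when it returns True. The tolerance and frames_required values are guesses and would need tuning against real footage.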
Related
I am currently playing a bit with ARKit. My goal is to detect a shelf and draw stuff onto it.
I did already find the ARReferenceImage and that basically works for a very, very simple prototype, but the image needs to be quite complex it seems? Xcode always complains if I try to use something a lot simpler (like a QR-Code like image). With that marker I would know the position of an edge and then I'd know the physical size of my shelf and know how to place stuff into it. So that would be ok, but I think small and simple markers will not work, right?
But ideally I would not need a marker at all.
I know that I can detect e.g. planes, but I want to detect the shelf itself. But as my shelf is open, it's not really a plane. Are there other possibilities to find an object using ARKit?
I know that my question is very vague, but maybe somebody could point me in the right direction. Or tell me if that's even possible with ARKit or if I need other tools? Like Unity?
There are several different possibilities for positioning content in augmented reality. They are called content anchors, and they are all subclasses of the ARAnchor class.
Image anchor
Using an image anchor, you would stick your reference image on a pre-determined spot on the shelf and position your 3D content relative to it.
the image needs to be quite complex it seems? Xcode always complains if I try to use something a lot simpler (like a QR-Code like image)
That's correct. The image needs to have enough visual detail for ARKit to track it. Something like a simple black and white checkerboard pattern doesn't work very well. A complex image does.
Object anchor
Using object anchors, you scan the shape of a 3D object ahead of time and bundle this data file with your app. When a user uses the app, ARKit will try to recognise this object and if it does, you can position your 3D content relative to it. Apple has some sample code for this if you want to try it out quickly.
Manually creating an anchor
Another option would be to enable ARKit plane detection, and have the user tap a point on the horizontal shelf. Then you perform a raycast to get the 3D coordinate of this point.
You can create an ARAnchor object using this coordinate, and add it to the ARSession.
Then you can again position your content relative to the anchor.
You could also implement a drag gesture to let the user fine-tune the position along the shelf's plane.
Conclusion
Which one of these placement options is best for you depends on the use case of your app. I hope this answer was useful :)
References
There are a lot of informative WWDC videos about ARKit. You could start off by watching this one: https://developer.apple.com/videos/play/wwdc2018/610
It is absolutely possible. Whether you do this in Swift or Unity depends entirely on what you are comfortable working in.
ARKit calls them object anchors (ARObjectAnchor): https://developer.apple.com/documentation/arkit/arobjectanchor. In other implementations they are often called mesh or model targets.
This YouTube video shows what you want to do in Swift.
But objects like a shelf might be hard to recognize since their content often changes.
I'm working on an application where the concept is that you can 'select' objects before actually placing them. So what I wanted to do was have some low quality objects on a shelf or something like it. When the user selects the object he then can tap to place the high quality version of the object in his area for further viewing.
I was wondering if this is possible with Vuforia. I wanted to use this platform since it works well from what I could tell and it's cross-platform (the application needs to run on Android and the HoloLens).
I have set up a basic application where you can place a capsule in the area. Now I want to place the object (in this case the capsule) automatically once Vuforia has detected a ground plane. From what I could see, the plane finder has events that fire when an input is detected, but I couldn't find an event that fires when the ground plane is detected. Is this possible with Vuforia? I know it's doable with the HoloLens, but I would like to know if it's possible for Android or other mobile devices. I really don't know where to start looking, so I hope someone can point me in the right direction.
Let me know if I need to include more information!
The Vuforia PlaneFinderBehaviour (see doc here) has the event OnAutomaticHitTest which fires every frame a ground plane is detected.
So you can use it to automatically spawn an object.
You have to add your method to the On Automatic Hit Test list instead of the On Interactive Hit Test list of the "Plane Finder".
I've heard that Vuforia Fusion does not yet support ARCore (it does support ARKit), so it uses an internal implementation to simulate ARCore functionality, and that they are waiting for a final release of ARCore before supporting it. Many users have reported that their objects move even when they use an ARCore-supported device.
Is it possible to recognise light patterns on iOS?
Is there a native iOS SDK to do so?
Use case:
Detect light patterns (e.g. on / off) using smartphone camera
Background information:
Apple acquired Metaio last year, so I presume that at some point we will have such an SDK. For now, I presume the best way to achieve this is either to use a third-party SDK or to capture images and process them myself (if the images are simple enough that a simple algorithm can be applied).
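For the simple on/off case, the "simple algorithm" could be as little as thresholding the average brightness of each frame. The following is a rough sketch in Python, not iOS code. It assumes you already have grayscale frames as rows of 0-255 pixel values (on iOS these would typically come from an AVCaptureVideoDataOutput callback). The function names and the threshold values are made up for illustration. Two thresholds are used so that brightness hovering around a single cutoff does not make the state flicker.

```python
from typing import List, Sequence, Tuple


def mean_brightness(gray_rows: Sequence[Sequence[int]]) -> float:
    """Average pixel value of one grayscale frame given as rows of 0-255 ints."""
    total = sum(sum(row) for row in gray_rows)
    count = sum(len(row) for row in gray_rows)
    return total / count


def decode_on_off(brightness: Sequence[float],
                  on_threshold: float = 160,
                  off_threshold: float = 100) -> List[Tuple[str, int]]:
    """Turn per-frame brightness into runs such as [("off", 2), ("on", 4)].

    The light switches to "on" only above on_threshold and back to "off"
    only below off_threshold; values in between keep the previous state.
    """
    runs: List[Tuple[bool, int]] = []
    state = None
    for value in brightness:
        if state is None:
            # First frame: pick the nearer state using the midpoint.
            state = value >= (on_threshold + off_threshold) / 2
        elif state and value < off_threshold:
            state = False
        elif not state and value > on_threshold:
            state = True
        if runs and runs[-1][0] == state:
            runs[-1] = (state, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((state, 1))  # start a new run
    return [("on" if is_on else "off", frames) for is_on, frames in runs]
```

Dividing each run length by the camera frame rate gives the duration of each on or off phase, which you can then compare against the pattern you expect. This only works if the light dominates the frame; otherwise you would first crop to the region where the light is.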
You could take a look at Kudan AR. https://www.kudan.eu/
They currently offer an SDK for iOS, but not yet for Android. Their tracking quality is phenomenally good, but I do not know whether it suits your goals. It would be best to talk to them and ask whether their tracking would fit your needs.
I want to track the relative position of a camera aimed at a computer screen.
I can’t control what is displayed on the computer screen but I can receive screen dumps whenever something changes on the screen. Those screen dumps can hopefully be used to find the screen when analyzing the video from the camera.
I see many videos on YouTube about tracking faces, logos or single-colored objects using OpenCV, but I’m unsure whether those methods would work for finding and tracking a more detailed image like a screen dump.
Maybe Template Matching is the way to go? But I need to find the screen even at an angle.
Basically I don’t know where to begin and need help from people with experience in this field to find the best way for achieving what I want.
Thanks
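Using feature matching should do the trick (SIFT/SURF/ORB/...). Unlike template matching, it copes with the screen being seen at an angle: once enough keypoints from the screen dump are matched to keypoints in the camera frame, you can estimate a homography between the two and get the screen's position and perspective from it.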
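To give an idea of what the matching step involves: ORB produces binary descriptors, which are compared by Hamming distance, and ambiguous matches are usually discarded with a ratio test. The sketch below shows just that step in plain Python on toy 8-bit descriptors stored as integers. The function names are made up for illustration. In a real project you would not write this yourself; OpenCV provides it through cv2.ORB_create, cv2.BFMatcher with cv2.NORM_HAMMING, and cv2.findHomography for the final pose estimate.

```python
from typing import List, Sequence, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match_descriptors(query: Sequence[int], train: Sequence[int],
                      ratio: float = 0.75) -> List[Tuple[int, int, int]]:
    """Brute-force matching with a ratio test.

    Returns (query_index, train_index, distance) for every query descriptor
    whose best match is clearly better than its second-best match.
    """
    matches: List[Tuple[int, int, int]] = []
    if len(train) < 2:
        return matches  # the ratio test needs two candidates
    for qi, q in enumerate(query):
        ranked = sorted((hamming(q, t), ti) for ti, t in enumerate(train))
        (best_d, best_i), (second_d, _) = ranked[0], ranked[1]
        if best_d < ratio * second_d:
            matches.append((qi, best_i, best_d))
    return matches
```

Here query would hold the descriptors from the screen dump and train the descriptors from the camera frame. The surviving index pairs tell you which keypoints correspond, and those point pairs are what a homography estimator takes as input.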
Is it possible to use Vuforia without a camera for image tracking?
Basically, I would like a function that I could call with an image as the input parameter and that returns the coordinates of an image target. Does that exist?
It is unfortunately not possible. I've been looking for such an option myself several times while working on a Moodstocks (image recognition SDK) / Vuforia mashup (see these 2 blog posts if you are interested in it), but the Vuforia SDK prevents the use of any other source than the camera.
I guess the main reason for this is that camera management is fully handled internally by the Vuforia SDK, probably to make it easier to use, as managing the camera ourselves is at best a boring task (lines and lines of code to repeat in each project...) and at worst a huge pain in the ass (especially on Android, where some devices don't behave as expected).
By the way, it looks to me like the Vuforia SDK is not the best solution for your use case: it is mainly an augmented-reality SDK, focussed on real-time tracking, which implies working with a camera stream... so using it to do "simple" image recognition looks like real overkill!