I'm looking to implement a face recognition feature, and I see OpenCV is capable of it: https://github.com/Mjrovai/OpenCV-Face-Recognition
At the same time, I see many 3rd party face verification SDKs, like
http://kairos.com, http://www.neurotechnology.com/face-verification.html, http://ever.ai, etc. In general practice, what's the difference between OpenCV and the 3rd party ones if you only need offline face recognition with no fancy add-ons, and which should be used?
The example you linked uses a method (LBP, Local Binary Patterns) for face recognition that is outdated compared to the state of the art, and I think it can hardly lead you to excellent results.
The SDKs you mention are paid products, and since they are obviously not open source, I cannot know what technologies they use.
If you prefer to implement a good face recognition pipeline yourself, use OpenCV only for the image capture / video stream part, and then something like TensorFlow/Keras/PyTorch for the deep learning part. A sketch of that split follows.
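To make that concrete, here is a minimal sketch of the split. I'm picking the facenet-pytorch package for the deep learning half purely as an example, not because the answer above prescribes it; any embedding network in TensorFlow/Keras/PyTorch would slot in the same way:

```python
# Sketch: OpenCV for capture, a PyTorch model for recognition.
# Assumes the facenet-pytorch package (pip install facenet-pytorch),
# which is just one example of a modern deep face-embedding model.
import cv2
import torch
from facenet_pytorch import MTCNN, InceptionResnetV1

detector = MTCNN(image_size=160)                             # face detection + alignment
embedder = InceptionResnetV1(pretrained='vggface2').eval()   # 512-d face embeddings

def embedding(bgr_frame):
    """Return an embedding for the first face in the frame, or None."""
    rgb = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB)  # OpenCV frames are BGR
    face = detector(rgb)                              # cropped, aligned face tensor
    if face is None:
        return None
    with torch.no_grad():
        return embedder(face.unsqueeze(0))[0]

# Enroll a reference face once, then compare live frames against it.
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
reference = embedding(frame)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    emb = embedding(frame)
    if emb is not None and reference is not None:
        # Cosine similarity of embeddings; tune the match threshold
        # on your own data rather than trusting any fixed value.
        sim = torch.nn.functional.cosine_similarity(emb, reference, dim=0).item()
        print(f"similarity: {sim:.2f}")
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
```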
I am interested in the Visual Inertial SLAM algorithm implemented in the ARKit SDK for motion tracking, which performs visual SLAM and fuses it with inertial data. I understand the algorithm and how tracking is performed.
Since I want to use my own custom camera, and not an iPhone, I was wondering whether an equivalent open source implementation is already available that performs VI-SLAM (visual SLAM fused with inertial data) for tracking, with comparable performance. I am not looking for SDKs to use as APIs, but rather algorithm implementations that I can edit myself.
Apologies if this question belongs in another forum.
You can try the popular ARToolKit5. It is fast, intuitive, and cross-platform: you can run it on macOS, iOS, Linux, Android, or Windows. It was released in 2015 as a completely open source platform under LGPLv3 (or later). There is also its successor, ARToolKitX, which is the latest release.
There are many open source VI-SLAM implementations on GitHub. I recommend trying VINS-Mono (https://github.com/HKUST-Aerial-Robotics/VINS-Mono). You can use your own camera to collect images and IMU data, or you can use public datasets.
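If you go the custom-camera route, the main practical requirement is timestamped frames and IMU samples on a shared clock. Here is a rough sketch of the camera half with OpenCV; note that VINS-Mono itself consumes ROS topics/bags, so a raw dump like this still has to be converted, and the file layout here is just my own convention:

```python
# Sketch: log timestamped frames from a custom camera for later VI-SLAM use.
# VINS-Mono reads ROS topics/bags, so this raw dump would still need
# converting; the file layout is only an illustrative convention.
import csv
import time
import cv2

cap = cv2.VideoCapture(0)  # your custom camera
with open('timestamps.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['timestamp_ns', 'filename'])
    for i in range(1000):  # grab ~1000 frames
        ok, frame = cap.read()
        if not ok:
            break
        t = time.time_ns()  # IMU samples must be stamped with this same clock
        name = f'frame_{i:06d}.png'
        cv2.imwrite(name, frame)
        writer.writerow([t, name])
cap.release()
```

The hard requirements are that camera and IMU timestamps come from the same clock, and that you have good camera intrinsics and camera-IMU extrinsic calibration.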
Can OpenCV seamlessly interact with all cameras that comply with these standards?
No, it cannot. You need something called a GenTL Producer in order to interact with your camera. Normally your vendor's SDK comes with one. Alternatively, you can use the one from Baumer or the one from Stemmer Imaging.
Another option is Harvesters, an open source project that aims to do exactly this, although you need a GenTL Producer for that as well.
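For illustration, a minimal Harvesters grab might look like the following. The .cti path is a placeholder for whichever GenTL Producer you install, and Harvesters has renamed some of these methods between versions, so check its README against your installed version:

```python
# Sketch: grabbing one frame through a GenTL Producer with Harvesters.
# The .cti path is a placeholder for your vendor's (or Baumer's/Stemmer's)
# producer; method names have shifted across Harvesters versions.
from harvesters.core import Harvester

h = Harvester()
h.add_file('/path/to/your/GenTLProducer.cti')  # vendor-supplied producer
h.update()                                     # enumerate reachable cameras
print(h.device_info_list)

ia = h.create_image_acquirer(0)                # first camera found
ia.start_acquisition()
with ia.fetch_buffer() as buffer:              # one frame
    component = buffer.payload.components[0]
    img = component.data.reshape(component.height, component.width)
    print(img.shape, img.dtype)                # plain NumPy array, ready for OpenCV
ia.stop_acquisition()
ia.destroy()
h.reset()
```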
I am a newbie with drones. I would like to develop a program that uses OpenCV to fly a drone indoors over a line.
I have been researching a lot of options, but almost all of them are GPS-based. I saw there is an alternative called SLAM that can determine position using the onboard sensors.
Well, I have a line on the floor and a camera on my drone. I will be using a Parrot AR.Drone, but I would like the approach to work with any drone. I like Mission Planner, but I am not sure it is the best choice.
What SDK would you recommend for controlling the drone using relative locations or SLAM rather than GPS points?
Well, you have the Parrot API and a couple of wrappers in different languages: node-ar-drone for Node.js, PyArdrone for Python, and a wrapper coded in C#, AR.Drone, which I have used. The C# one has a good user interface in which you can see both cameras, record and replay videos, control the drone by clicking buttons, inspect the drone's metrics and configuration, and queue up commands to send. Because I love C# and those features already come with a user interface, I prefer it. Most of the wrappers are much the same underneath, as they all use the Parrot API by sending UDP messages. I couldn't try the others, and since there are a lot, anybody is welcome to tell me which one is the best.

As for Mission Planner, I couldn't find a good solution for indoors. So, for anyone who is lost and does not know where to start, as I was: I recommend picking the language you want and searching for the corresponding wrapper. If you like C# as I do, AR.Drone is a good choice.
Also, if you want to do something with OpenCV, Copterface is a good example; you could implement it in any language that has OpenCV bindings.
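For the line-following part specifically, the OpenCV half boils down to finding the line in each frame and computing how far it is from the image center. A minimal sketch of that vision half (the actual steering command depends entirely on whichever wrapper you pick):

```python
# Sketch: detect a floor line in the drone's camera frame and compute a
# horizontal offset to steer by. Sending the actual roll/yaw command is
# up to whichever AR.Drone wrapper you choose.
import cv2
import numpy as np

def line_offset(frame):
    """Return the line's horizontal offset from frame center, or None."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=50,
                            minLineLength=50, maxLineGap=10)
    if lines is None:
        return None
    # Average the x midpoints of the detected segments.
    xs = [(x1 + x2) / 2 for x1, y1, x2, y2 in lines[:, 0]]
    return sum(xs) / len(xs) - frame.shape[1] / 2  # <0: steer left, >0: right

cap = cv2.VideoCapture(0)  # stand-in for the drone's video stream
while True:
    ok, frame = cap.read()
    if not ok:
        break
    offset = line_offset(frame)
    if offset is not None:
        print(f"steer correction: {offset:+.0f}px")  # feed into roll/yaw control
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
```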
I'm working on an application in Swift, and I was thinking about a way to add non-speech sound recognition to my project.
What I mean is: is there a way to take in sound input and match it against some predefined sounds already incorporated in the project, and, if a match occurs, perform some particular action?
Is there any way to do the above? I'm thinking of breaking up the sounds and doing the checks, but I can't seem to get any further than that.
My personal experience matches matt's comment above: this requires serious technical knowledge.
There are several ways to do this, and a typical one is as follows: extract some properties from the sound segment of interest (audio feature extraction), then classify that audio feature vector with some kind of machine learning technique. This typically requires a training phase in which the machine learning technique is given examples of the sounds you want to recognize (your predefined sounds), so that it can build a model from that data. A small sketch of this pipeline follows.
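To make that pipeline concrete, here is a sketch using librosa for feature extraction and scikit-learn for the classifier. Both libraries and all the file names are my own placeholder choices, not part of the answer above:

```python
# Sketch: the extract-features-then-classify pipeline described above.
# Assumes short WAV clips of each predefined sound; librosa and
# scikit-learn are one possible toolset, not the only one.
import librosa
import numpy as np
from sklearn.svm import SVC

def features(path):
    """Summarize a clip as its mean MFCC vector (one common feature choice)."""
    y, sr = librosa.load(path, sr=22050)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return mfcc.mean(axis=1)

# Training phase: a few labeled examples per predefined sound (placeholder files).
train_files = [('clap_01.wav', 'clap'), ('clap_02.wav', 'clap'),
               ('whistle_01.wav', 'whistle'), ('whistle_02.wav', 'whistle')]
X = np.array([features(path) for path, _ in train_files])
y = [label for _, label in train_files]
clf = SVC().fit(X, y)

# Recognition phase: classify an incoming sound segment.
print(clf.predict([features('incoming.wav')]))
```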
Without knowing what types of sounds you're aiming to recognize, maybe our C/C++ SDK might do the trick for you: http://www.samplesumo.com/percussive-sound-recognition
There's a technical demo on that page that you can download and try with your sounds. It's a C/C++ library, and there are Mac, Windows, and iOS versions, so you should be able to integrate it into a Swift app on iOS. Maybe this will allow you to do what you need?
If you want to develop your own technology, you may want to start by finding and reading some scientific papers using the keywords "sound classification", "audio recognition", "machine listening", "audio feature classification", ...
Matt,
We've been developing a bunch of cool tools to speed up iOS development, especially in Swift. One of these tools is what we call TLSphinx: a Swift wrapper around Pocketsphinx that can perform speech recognition without the audio ever leaving the device.
I assume TLSphinx can help you solve your problem, since it is a totally open source library. Search for it on GitHub ('TLSphinx'); you can also download our iOS app ('Tryolabs Mobile Showcase') and try the module live to see how it works.
Hope it is useful!
Best!
I have been working on Augmented Reality for quite a few months. I have used third party tools like Unity/Vuforia to create augmented reality applications for Android.
I would like to create my own framework in which to build my own AR apps. Can someone point me to the right tutorials/links to achieve this? At a high level, my plan is to create an application that can recognize multiple markers and match them against cloud-stored models.
That seems like a massive undertaking: model recognition is not an easy task. I recommend looking at OpenCV, which has some standard algorithms you can use as a starting point (a small marker-detection sketch is below), and then at a good computer vision book (e.g., Richard Szeliski's book, or Hartley and Zisserman).
But you are going to run into a host of practical problems. Consider that systems like Vuforia provide camera calibration data for most Android devices, and it's hard to do computer vision without it. Then, of course, there's efficiently managing the whole pipeline, which (again) companies like Qualcomm and Metaio invest huge amounts of money in.
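For the marker-recognition piece, OpenCV's ArUco module (in opencv-contrib) is one of those standard starting points. To be clear, this is not what Vuforia does internally, just a baseline; also note the ArUco API changed in OpenCV 4.7, and in earlier versions the call is cv2.aruco.detectMarkers(gray, dictionary):

```python
# Sketch: marker detection with OpenCV's ArUco module (opencv-contrib-python,
# 4.7+ API). A baseline starting point, not Vuforia's algorithm; the
# dictionary choice here is arbitrary.
import cv2

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_6X6_250)
detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    corners, ids, _ = detector.detectMarkers(gray)
    if ids is not None:
        # Each detected id could serve as a key into your cloud-stored models.
        cv2.aruco.drawDetectedMarkers(frame, corners, ids)
    cv2.imshow('markers', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
```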
I'm working on a project that does framemarker tracking, and I've started exporting bits of it into a project I'm calling OpenAR. Right now I'm in the process of pulling out unpublishable pieces and making the Vuforia and OpenCV versions of marker tracking interchangeable. You're certainly welcome to check out the work as it progresses; you can see videos of some of the early work on my YouTube channel.
The hard work is improving the performance to be as good as Vuforia's.