Is it a good idea to use a single-board-computer in a UAV robot? - robotics

I'm not sure it's good or bad, the robot should have computer vision for SLAM. What's your idea?

It is a great idea to use a Single Board Computer and System on Modules for developing a UAV robot. In fact, it has already been successfully implemented before. I remember seeing a similar implementation of using a Toradex Colibri on Iris Carrier Board for an entry to the Embedded Design Challenge contest by Toradex. Here are the details of the project including the video, updates and source codes at the following link: http://www.challenge.toradex.com/projects/10099-tdxcopter

Yes, that's how we did it when I was in school (albeit nine years ago). You want to focus on algorithms, not learning to program an unfamiliar platform.
Assuming the "A" stands for aerial, don't invest in anything you don't want crashing at high speed. And mind the vibrations.

Related

Robotics library in Forth?

I have read the documentation for the Roboforth environment from STrobotics and recognized that this a nice way for programming a robot. What I missed is a sophisticated software library with predefined motion primitives. For example, for picking up a object, for regrasping or for changing a tool.
In other programming languages like Python or C++, a library is a convenient way for programming repetitive tasks and for storing expert knowledge into machine-readable files. Also a library is good way for not-so-talented programmers to get access on higher-level-functions. In my opinion Forth is the perfect language for implementing such an API, but I didn't find information about it. Where should I search? Are there any examples out there?
I am author of RoboForth, and you make a good point. I have approached the problem of starting off new users with videos on YouTube; see How to... (playlist with 6 items, e.g "ST Robotics How-to number 1 - getting started") which is a playlist covering basics and indeed tool changing.
I never wrote any starter programs, because the physical positions (coordinates) would be different from one user to the next, however I think it can be done, and I will do it. Thanks for the heads up.

Possible use case/real application for mobile distributed version of Tensorflow?

I'm developing this project where I'm trying to create a distributed version of Tensorflow (the actual open source version is single node) and where the cluster is entirely composed by mobile devices (e.g. smartphones).
In your opinion, what is a possible application or use case where this could be useful? Can you give me some example please?
I know that this is not a "standard" Stack Overflow question, but I didn't know where to post it (if you know a better place where to post it, please let me know it). Thanks so much for your help!
http://www.google.com.hk/search?q=teonsoflow+android
TensorFlow can be used for image identification and there is an example using the camera for Android.
There could be many distributed uses for this. Face recognition, 3D space construction from 2D images.
TensorFlow can be used for a chat bot. I am working towards using it for a personal assistant. The Ai on one phone could communicate with the Ai on other phones.
It could use vision and GPS to 'reserve' a lane for you on the road. Intelligent crowd planned roads and intersections would be safer.
I am also interested in using it for distributed mobile. Please contact me with my user name at gmail or Skype.
https://boinc.berkeley.edu
I think all my answers could run on individual phones with communication between them. If you want them to act like a cluster as #Yaroslav pointed out there is Seti#home and other projects running in the BOINC client.
TensorFlow could be combined with a game engine. You could have a proceduraly generated Ai learning augumented reality game generating the story as multiple players interact with it. I have seen research papers for each of these components.

"Sound" Recognition in Swift?

I'm working on an applicaion in Swift and I was thinking about a way to get Non-Speech sound recognition in my project.
I mean is there a way in which I can take in sound inputs and match them against some predefined sounds already incorporated in the project and if a match occurs, it should do some particular action?
Is there any way to do the above? I'm thinking breaking up the sounds and doing the checks, but can't seem to get any further than that.
My personal experience follows matt's comment above: requires serious technical knowledge.
There are several ways to do this, and one is typically as follows: extract some properties from the sound segment of interest (audio feature extraction), and classify this audio feature vector with some kind of machine learning technique. This typically requires some training phase where the machine learning technique was given some examples to learn what sounds you want to recognize (your predefined sounds) so that it can build a model from that data.
Without knowing what types of sounds you're aiming for to be recognized, maybe our C/C++ SDK available here might do the trick for you: http://www.samplesumo.com/percussive-sound-recognition
There's a technical demo on that page that you can download and try with your sounds. It's a C/C++ library, and there is a Mac, Windows and iOS version, so you should be able to integrate it with a Swift app on iOS. Maybe this will allow you to do what you need?
If you want to develop your own technology, you may want to start by finding and reading some scientific papers using the keywords "sound classification", "audio recognition", "machine listening", "audio feature classification", ...
Matt,
We've been developing a bunch of cool tools to speed up iOS development, specially in Swift. One of these tools is what we called TLSphinx: a Swift wrapper around Pocketsphinx which can perform speech recognition without the audio leaving the device.
I assume TLSphinx can help you solve your problem since it is a totally open source library. Search for it on Github ('TLSphinx') and you can also download our iOS app ('Tryolabs Mobile Showcase') and try the module live to see how it works.
Hope it is useful!
Best!

How to start with Augmented reality to create my own framework (Not AR App)

I have been working Augmented Reality for quite a few months. I have used third party tools like Unity/Vuforia to create augmented reality applications for android.
I would like to create my own framework in which I will create my own AR apps. Can someone guide me to right tutorials/links to achieve my target. On a higher level, my plan is to create an application which can recognize multiple markers and match it with cloud stored models.
That seems like a massive undertaking: model recognition is not an easy task. I recommend looking at OpenCV (which has some standard algorithms you can use as a starting point) and then looking at a good computer vision book (e.g., Richard Szeliski's book or Hartley and Zisserman).
But you are going to run into a host of practical problems. Consider that systems like Vuforia provide camera calibration data for most Android devices, and it's hard to do computer vision without it. Then, of course, there's efficiently managing the whole pipeline which (again) companies like Qualcomm and Metaio invest huge amounts of $$ in.
I'm working on a project that does framemarker tracking and I've started exporting bits of it out to a project I'm calling OpenAR. Right now I'm in the process of pulling out unpublishable pieces and making Vuforia and the OpenCV versions of marker tracking interchangeable. You're certainly welcome to check out the work as it progresses. You can see videos of some of the early work on my YouTube channel.
The hard work is improving performance to be as good as Vuforia.

iOS / C: Algorithm to detect phonemes

I am searching for an algorithm to determine whether realtime audio input matches one of 144 given (and comfortably distinct) phoneme-pairs.
Preferably the lowest level that does the job.
I'm developing radical / experimental musical training software for iPhone / iPad.
My musical system comprises 12 consonant phonemes and 12 vowel phonemes, demonstrated here. That makes 144 possible phoneme pairs. The student has to sing the correct phoneme pair 'laa duu bee' etc in response to visual stimulus.
I have done a lot of research into this, it looks like my best bet may be to use one of the iOS Sphinx wrappers ( iPhone App › Add voice recognition? is the best source of information I have found ). However, I can't see how I would adapt such a package, can anyone with experience using one of these technologies give a basic rundown of the steps that would be required?
Would training be necessary by the user? I would have thought not, as it is such an elementary task, compared with full language models of thousands of words and far greater and more subtle phoneme base. However, it would be acceptable (not ideal) to have the user train 12 phoneme pairs: { consonant1+vowel1, consonant2+vowel2, ..., consonant12+vowel12 }. The full 144 would be too burdensome.
Is there a simpler approach? I feel like using a fully featured continuous speech recogniser is using a sledgehammer to crack a nut. It would be far more elegant to use the minimum technology that would solve the problem.
So really I'm hunting for any open source software that recognises phonemes.
PS I need a solution which runs pretty much real-time. so even as they are singing the note, firstly it blinks on to illustrate that it picked up the phoneme pair that was sung, and then it glows to illustrate whether they are singing the correct note pitch
If you are looking for a phone-level open source recogniser, then I would recommend HTK. Very good documentation is available with this tool in the form of the HTK Book. It also contains an entire chapter dedicated to building a phone level real-time speech recogniser. From your problem statement above, it seems to me like you might be able to re-work that example into your own solution. Possible pitfalls:
Since you want to do a phone level recogniser, the data needed to train the phone models would be very high. Also, your training database should be balanced in terms of distribution of the phones.
Building a speaker-independent system would require data from more than one speaker. And lots of that too.
Since this is open-source, you should also check into the licensing info for any additional details about shipping the code. A good alternative would be to use the on-phone recorder and then have the recorded waveform sent over a data channel to a server for the recognition, pretty much something like what google does.
I have a little bit of experience with this type of signal processing, and I would say that this is probably not the type of finite question that can be answered definitively.
One thing worth noting is that although you may restrict the phonemes you are interested in, the possibility space remains the same (i.e. infinite-ish). User training might help the algorithms along a bit, but useful training takes quite a bit of time and it seems you are averse to too much of that.
Using Sphinx is probably a great start on this problem. I haven't gotten very far in the library myself, but my guess is that you'll be working with its source code yourself to get exactly what you want. (Hooray for open source!)
...using a sledgehammer to crack a nut.
I wouldn't label your problem a nut, I'd say it's more like a beast. It may be a different beast than natural language speech recognition, but it is still a beast.
All the best with your problem solving.
Not sure if this would help: check out OpenEars' LanguageModelGenerator. OpenEars uses Sphinx and other libraries.
http://www.hfink.eu/matchbox
This page links to both YouTube video demo and github source.
I'm guessing it would still be a lot of work to mould it into the shape I'm after, but is also definitely does do a lot of the work.

Resources