I'm building an app that converts speech to text. I googled and found that the Google Speech API is a good choice. Now I have a question: when the user speaks to an iOS device, how can I capture the audio? Do I need to include any frameworks or APIs? And what format should the raw audio file be, WAV or MP3? Thank you.
Why don't you take a look at some of the existing StackOverflow questions on this subject? Try "Speech to text Conversion.?" or "What is the current best speech recognition API for ios to match few keywords?"
Related
Is it possible to convert MP3 files into text without playing them and recording through the microphone, for example when listening to an audiobook on a mobile device? I looked for a relevant API in IBM Watson but couldn't find a solution.
There is no good or direct way to grab the audio output on Android.
Record Android Audio Output
For speech to text you could use the Google API.
That said, if you already have the MP3 file, converting it to text with the Google API should be no problem.
Take a look here for that.
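As a rough sketch of what handing audio to the Google API can look like: this assumes the Cloud Speech-to-Text v1 REST method `speech:recognize`, and that the MP3 has already been decoded to 16-bit mono PCM (the `LINEAR16` encoding). The helper name `build_recognize_request` is made up for illustration, and the HTTP call and authentication are left out.

```python
import base64
import json

# Assumed endpoint of the Cloud Speech-to-Text v1 REST API.
# A real call also needs an API key or OAuth token, which is not shown here.
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(pcm_bytes, sample_rate_hz=16000, language_code="en-US"):
    """Return the JSON body for a synchronous recognize call (hypothetical helper)."""
    body = {
        "config": {
            "encoding": "LINEAR16",            # 16-bit signed little-endian PCM
            "sampleRateHertz": sample_rate_hz,  # must match the audio data
            "languageCode": language_code,
        },
        "audio": {
            # The REST API expects the raw audio bytes base64-encoded.
            "content": base64.b64encode(pcm_bytes).decode("ascii"),
        },
    }
    return json.dumps(body)


if __name__ == "__main__":
    # Half a second of silence at 16 kHz, standing in for real decoded audio.
    silence = b"\x00\x00" * 8000
    payload = build_recognize_request(silence)
    print(RECOGNIZE_URL)
    print(payload[:100])
```

The resulting string would be sent as the POST body to the URL above; the synchronous method is meant for short clips, so a whole audiobook would need to be split up or sent through a long-running variant.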
If I use the audio of a good YouTube video as input for the Google Cloud Speech API, would you say that I will get the "same" transcript as the one automatically provided by YouTube?
If you're interested in how YouTube automatic captions work, read their blog post "Automatic captions in YouTube":
To help address this challenge, we've combined Google's automatic
speech recognition (ASR) technology with the YouTube caption system to
offer automatic captions, or auto-caps for short. Auto-caps use the
same voice recognition algorithms in Google Voice to automatically
generate captions for video. The captions will not always be perfect
(check out the video below for an amusing example), but even when
they're off, they can still be helpful—and the technology will
continue to improve with time.
Credits to this Quora post.
I want to use the Google Voice Service with a video file as input instead of the microphone.
For example, a video file is playing on my computer and a Google speech recognition program recognizes the video's audio stream,
like the auto-caption function of YouTube.
How can I use Google speech recognition (G.S.R.) this way?
This is a great question. Google does provide a way of doing that through the Web Speech API. Here's a link to an example usage, and a demo site from Google here.
However, you would have to extract the audio from the video first and then feed the audio to the API.
There's also the Cloud Speech API, which is free up to a certain point. It can be found here.
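The "extract the audio first" step could be sketched like this. It assumes the `ffmpeg` command-line tool is installed (it is not mentioned above, just a common choice), and the function names and file names are made up for illustration. The output is a mono 16 kHz 16-bit PCM WAV, a format speech recognition services commonly accept.

```python
import subprocess


def ffmpeg_extract_audio_cmd(video_path, wav_path, sample_rate_hz=16000):
    """Build an ffmpeg command that writes the video's audio track as a mono PCM WAV."""
    return [
        "ffmpeg",
        "-y",                        # overwrite the output file if it exists
        "-i", video_path,            # input video
        "-vn",                       # drop the video stream
        "-ac", "1",                  # downmix to one channel
        "-ar", str(sample_rate_hz),  # resample to the target rate
        "-acodec", "pcm_s16le",      # 16-bit little-endian PCM
        wav_path,
    ]


def extract_audio(video_path, wav_path):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(ffmpeg_extract_audio_cmd(video_path, wav_path), check=True)


if __name__ == "__main__":
    # Only prints the command, so this runs even without ffmpeg installed.
    print(" ".join(ffmpeg_extract_audio_cmd("talk.mp4", "talk.wav")))
```

The WAV file can then be read and passed to whichever recognition API you pick.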
I am working on speech recognition functionality using the Nuance Dragon speech recognition SDK.
I need the audio file that is recorded during speech recognition.
Is it possible to get this file?
If yes, how?
How can I make a Vlingo-like application?
Is there any API that can be used for making such apps for iOS?
Please provide some guidelines or tutorials; any help or comment will be appreciated.
Thanks in advance.
I would also like to know how Talking Ben the Dog and Talking Tom Cat work: when we talk, they repeat it in a funny voice. How is that possible?
For speech recognition on iOS, there have been many similar questions. Please see "Speech to text Conversion.?", "Text-to-speech (voice generation) and speech-to-text (voice recognition) APIs?", "Speech recognition framework for iOS that supports Spanish", or "What is the current best speech recognition API for ios to match few keywords?"
For text-to-speech you can use:
https://bitbucket.org/sfoster/iphone-tts/src
For speech recognition, see here:
http://www.politepix.com/openears