This question already has answers here:
iPhone App › Add voice recognition? [closed] (4 answers)
API or SDK for speech to text (speech recognition) iPhone (3 answers)
Closed 9 years ago.
I would like to convert spoken words into text so I can use NSLinguisticTagger in my app. How can I convert speech to text? What are the options? Does OpenEars support voice-to-text conversion?
OpenEars developer here. Yes, OpenEars does speech recognition and text-to-speech. You need to define a language model or grammar for it containing your vocabulary, but it can be done automatically from an NSArray of word or phrase NSStrings, or a text corpus.
OpenEars supports free speech recognition and text-to-speech in offline mode.
It provides the FliteController class, which handles speech synthesis (TTS) in OpenEars.
They have done an excellent job in the speech recognition area.
However, please note that it will detect only the words you define in your vocabulary files. It is good to work in offline mode to get better performance.
@Halle: Correct me if I'm wrong.
There is also a paid option, Dragon Dictation, which works as an online engine.
Or use VocalKit: a shim for speech recognition on iPhone.
I would like to point out that none of them is as accurate as Siri (a Siri SDK is not available yet).
Related
I'm trying to build software that will identify the language being spoken.
My plan is to use Google's Cloud Speech-to-Text to transcribe the speech, then put the transcription through the Cloud Translation API to detect its language.
However, since Speech-to-Text requires a language code to be set prior to transcribing, I was planning to run it multiple times with different languages and compare the "confidence" values to find the most confident transcription, which would then be put through the Cloud Translation API.
Would this be the ideal way? Or are there any other possible options?
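The selection step of the plan above can be sketched as plain Python. This is only a sketch of the comparison logic: `pick_most_confident` and the dict shapes are hypothetical stand-ins, not the actual `google-cloud-speech` client objects, and the confidences below are made up.

```python
# Sketch: transcribe once per candidate language (not shown), then keep the
# result whose top alternative has the highest confidence. The dicts mimic,
# in simplified form, the per-language transcription results.

def pick_most_confident(results_by_language):
    """results_by_language: {language_code: {"transcript": str, "confidence": float}}
    Returns (language_code, transcript) for the highest-confidence result."""
    best_lang = max(results_by_language,
                    key=lambda lang: results_by_language[lang]["confidence"])
    return best_lang, results_by_language[best_lang]["transcript"]

# Example with made-up confidences:
results = {
    "en-US": {"transcript": "hello world", "confidence": 0.93},
    "de-DE": {"transcript": "hallo welt", "confidence": 0.41},
}
lang, text = pick_most_confident(results)
print(lang, text)  # prints: en-US hello world
```

One caveat with this approach: confidence scores from runs configured for different languages are not strictly comparable, so treat the winner as a best guess rather than a certainty.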
Maybe you can check the "Detecting language spoken automatically" page in the Google Cloud Speech documentation.
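That page covers the `alternativeLanguageCodes` field of `RecognitionConfig`, which lets a single recognize request consider several candidate languages instead of one run per language. A hedged sketch of the REST request body follows; the field names are camelCase per the REST API, the availability of `alternativeLanguageCodes` in your API version should be checked against the docs, and the `gs://` URI is a placeholder.

```python
# Sketch of a Cloud Speech-to-Text REST request body that supplies one
# primary language plus additional candidate languages, so the service
# itself picks the best-matching language.

def build_recognize_request(primary_language, alternatives, audio_uri):
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": 16000,
            "languageCode": primary_language,          # best guess
            "alternativeLanguageCodes": alternatives,  # extra candidates
        },
        "audio": {"uri": audio_uri},
    }

# Placeholder bucket/file name for illustration only:
request = build_recognize_request("en-US", ["de-DE", "fr-FR"],
                                  "gs://my-bucket/sample.wav")
```

This body would be POSTed to the `speech:recognize` endpoint; the response reports which language code was actually used for the returned transcript.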
Currently I am using OpenEars to detect a phrase and it works pretty well, although I would like to recognize all words in the English language and add them to a text field. I had two thoughts on how to approach this:
1) Somehow load the entire English dictionary into OpenEars.
(I don't think this is a good idea, because they say 2–300 words or something like that.)
2) Activate the native iOS voice recognition without deploying the keyboard.
I'm leaning towards the second way if possible, because I love the live recognition in iOS 8; it works flawlessly for me.
How do I recognize all words using one of the two methods (or a better way if you know one)?
Thank you
The answer is that you can't do 1) or 2), at least not the way you want to. OpenEars won't handle the whole English dictionary, and you can't get iOS voice recognition without the keyboard widget. You might want to look into Dragon Dictation, which is the speech engine that Siri uses, or SILVIA. You'll have to pay for a license though.
I am designing an English dictionary-like app and using OpenEars TTS for pronunciation but the voice quality is not so good. Any suggestion to improve its sound quality?
If you are supporting iOS 7 and above, you can consider using AVSpeechSynthesizer.
What you can do is contact them directly and state your problem. Here is the link to their contact page: http://www.politepix.com/contact/
The best way to get your question answered is in their forums; here is the link for that: http://www.politepix.com/forums/forum/openearsforum/
They also have a paid private support service; here is the link for that: http://www.politepix.com/shop/openears-support-incident/
Now I am confused about which Text To Speech engine to use for iOS.
When I searched the internet, I found two Text To Speech engines for iOS.
These are iSpeech and the Dragon Mobile SDK from Nuance.
I have to use Text To Speech for multiple languages and also have to use Speech To Text.
I want to know which engine is better and which is faster.
Thanks in advance.
I am working on voice recognition to display the phonemes and their waveform, if possible using the built-in voice recognition on Vista and Windows 7, with Delphi 2009. Other programming languages are welcome.
To get the wave form, you need to enable retained audio using SetAudioOptions:
m_pRecoCtxt->SetAudioOptions(SPAO_RETAIN_AUDIO, NULL, NULL);
Once you have the reco, you can get the audio using ISpRecoResult::GetAudio and do whatever processing you need.
For phonemes, I'd look at the answers on your other question here.