We are trying to use the built-in iOS text-to-speech tool to read Chinese words in our app.
It is good at reading running text, but we have problems with isolated words.
For example, take the character 还. It can be pronounced "hái", meaning "also, in addition", or "huán", meaning "to return".
In the phrase 我还要还钱 (wǒ hái yào huán qián) it pronounces 还 both ways, correctly.
For the standalone character 还, iOS prefers to read it only as "hái". How can we make it pronounce characters the way we need (if that is possible)?
As a quick solution, you can cut the required words out of longer recordings and play them back as audio instead of using TTS.
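A minimal sketch of that workaround, assuming the extracted clips are bundled with the app (the clip name "huan2" and the .m4a extension are hypothetical):

```swift
import AVFoundation

// Plays a pre-cut recording of a single word instead of relying on TTS.
// "huan2" is a hypothetical clip of 还 pronounced "huán".
final class WordAudioPlayer {
    private var player: AVAudioPlayer?

    func playClip(named name: String) {
        guard let url = Bundle.main.url(forResource: name, withExtension: "m4a") else {
            return // clip not bundled
        }
        player = try? AVAudioPlayer(contentsOf: url)
        player?.play()
    }
}

// Usage: keep a strong reference so playback is not cut short.
// let wordPlayer = WordAudioPlayer()
// wordPlayer.playClip(named: "huan2")
```

On newer iOS versions it may also be worth experimenting with AVSpeechUtterance(attributedString:) together with the AVSpeechSynthesisIPANotationAttribute key, which attaches an explicit pronunciation to a range of text, though how well it works with Chinese voices is untested here.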
I would like to translate English words to their phonetic transcriptions; text conversion only. For example, if I have the words "hello how are you", they would be converted to /ˈheˈləʊ haʊ ə jʊ/. Does anyone know an API for this? If there is a demo link, please share it with me.
Try WikSpeak on SourceForge:
https://sourceforge.net/projects/wikspeak/
WikSpeak is a tool that allows non-native English speakers to analyze the correlation between the pronunciation and spelling of English words. This program is a simple and fast graphic interface which can retrieve the phonetic transcription (IPA) and the pronunciation of any English word, while avoiding the annoying process of browsing dictionaries. The most outstanding features of WikSpeak are the savings in time to determine the pronunciation of English words and the ease in understanding phonetic transcription.
WikSpeak is a highly recommended tool for anyone from the beginner to the advanced non-native English speaker.
After integrating VoiceOver through accessibilityLabels and testing the interaction alone, it was time to try turning on the voice. Fortunately, it worked perfectly well for English text... but I wasn't so lucky with Arabic.
Apparently, VoiceOver utters "unpronounceable" when it reaches Eastern Arabic-Indic numerals:
١ ، ٢ ، ٣
It is really inefficient to listen to every accessibility label just to make sure, so I thought there would be some sort of query we could run against the TTS engine and write tests around that.
All I know after researching is that the underlying TTS engine is AVSpeechSynthesizer, but it doesn't seem to offer anything of that sort.
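For what it's worth, there is no public "is this pronounceable?" query. As a rough heuristic one could speak each label at zero volume and record the character ranges the engine reports through its delegate; the following is a sketch of that idea, not a documented testing API:

```swift
import AVFoundation

// Heuristic probe: speak a label at zero volume and log the character
// ranges the engine reports. This is not a documented "pronounceability"
// check; it only shows which ranges the voice walks through.
final class SpeechProbe: NSObject, AVSpeechSynthesizerDelegate {
    private let synthesizer = AVSpeechSynthesizer()
    private(set) var spokenRanges: [NSRange] = []

    override init() {
        super.init()
        synthesizer.delegate = self
    }

    func probe(_ label: String, language: String = "ar-SA") {
        spokenRanges.removeAll()
        let utterance = AVSpeechUtterance(string: label)
        utterance.voice = AVSpeechSynthesisVoice(language: language)
        utterance.volume = 0 // keep the probe silent
        synthesizer.speak(utterance)
    }

    // Called for each range as it is about to be spoken.
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                           willSpeakRangeOfSpeechString characterRange: NSRange,
                           utterance: AVSpeechUtterance) {
        spokenRanges.append(characterRange)
    }
}
```

Speaking is asynchronous, so a test would have to wait for the didFinish delegate callback before inspecting spokenRanges, and VoiceOver itself may still behave differently from AVSpeechSynthesizer.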
The Mac OS speech synthesizer has a set of embedded commands that let you do things like change the pitch, speech rate, level of emphasis, etc. For example, you might use
That is [[emph +]]not[[emph -]] my dog!
To add emphasis to the word "not" in the phrase
That is not my dog!
Is there any such support in the iOS speech synthesizer? It looks like there is not, but I'm hoping against hope somebody knows of a way to do this.
As a follow-on question, is there a way to make global changes to the stock voice you get for a given locale? In the Siri settings you can select the language and country as well as the gender, but AVSpeechSynthesizer appears to give you only a single, seemingly arbitrary gender for each language/country. (For example, the voice for en-US is female, en-GB is male, and en-AU is female, with no apparent way to change it.)
I agree that it doesn't seem possible. From the docs, it seems Apple intends that you would create separate utterances and manually adjust the pitch/rate:
Because an utterance can control speech parameters, you can split text into sections that require different parameters. For example, you can emphasize a sentence by increasing the pitch and decreasing the rate of that utterance relative to others, or you can introduce pauses between sentences by putting each one into an utterance with a leading or trailing delay. Because the speech synthesizer sends messages to its delegate as it starts or finishes speaking an utterance, you can create an utterance for each meaningful unit in a longer text in order to be notified as its speech progresses.
I'm thinking of creating a category extension on AVSpeechUtterance that parses embedded commands (as in your example) and automatically creates the separate utterances. If someone else has done this, or wants to help, please let me know. I'll update here.
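In the meantime, here is a minimal sketch of the approach the docs describe: split the sentence so the emphasized word gets its own utterance with adjusted pitch and rate (the multiplier values are guesses to tune by ear):

```swift
import AVFoundation

let synthesizer = AVSpeechSynthesizer()

// Emulates [[emph +]] by giving the stressed word its own utterance.
// AVSpeechSynthesizer queues utterances, so they play back in order.
func speakWithEmphasis(prefix: String, stressed: String, suffix: String) {
    let before = AVSpeechUtterance(string: prefix)

    let emphasized = AVSpeechUtterance(string: stressed)
    emphasized.pitchMultiplier = 1.3                           // raise pitch
    emphasized.rate = AVSpeechUtteranceDefaultSpeechRate * 0.8 // slow down

    let after = AVSpeechUtterance(string: suffix)

    [before, emphasized, after].forEach(synthesizer.speak)
}

speakWithEmphasis(prefix: "That is", stressed: "not", suffix: "my dog!")
```

As for the follow-on question, AVSpeechSynthesisVoice.speechVoices() lists every voice installed on the device, so on more recent iOS versions you can assign a specific voice to utterance.voice rather than accepting the default for a language code.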
Currently I am using OpenEars to detect a phrase, and it works pretty well, but I would like to recognize all words in the English language and add them to a text field. I have two thoughts on how to approach this:
1) Somehow load the entire English dictionary into OpenEars. (I don't think this is a good idea, because they say something like 2-300 words is the limit.)
2) Activate the native iOS voice recognition without deploying the keyboard.
I'm leaning towards the second way if possible, because I love the live recognition in iOS 8; it works flawlessly for me.
How do I recognize all words using one of the two methods (or a better way, if you know one)?
Thank you
The answer is that you can't do 1) or 2), at least not the way you want to. OpenEars won't handle the whole English dictionary, and you can't get iOS voice recognition without the keyboard widget. You might want to look into Dragon Dictation, which is the speech engine that Siri uses, or SILVIA. You'll have to pay for a license though.
I am creating an iOS game in which I have to inform the user about events in the game with voice, such as "you have moved one piece", "two pieces", or "well done, you have performed well".
The problem is that there are a large number of these voice lines, and if I bundle an audio file for each one, the app size will grow very large.
The second option I have discovered is to use a text-to-speech library. I have tried "OpenEars", but the issue is that I want a cartoon-character or bird-like voice, which is not available in any of the open-source text-to-speech libraries as far as I have searched.
Can anybody suggest a better way to handle this, or a text-to-speech framework with the different voice capabilities mentioned above?
Thanks in advance.
VoiceForge offers different TTS voices.
http://www.voiceforge.com
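If a web service is a concern, one alternative (not from VoiceForge, just a sketch of a different technique) that keeps the bundle small is to ship a single set of recordings and pitch-shift the output at runtime with AVAudioUnitTimePitch to get a cartoonish timbre; the asset name below is hypothetical:

```swift
import AVFoundation

// Plays a bundled clip through a pitch shifter to approximate a
// cartoon-like voice without shipping extra audio variants.
final class CartoonVoicePlayer {
    private let engine = AVAudioEngine()
    private let playerNode = AVAudioPlayerNode()
    private let pitch = AVAudioUnitTimePitch()

    init() {
        pitch.pitch = 600 // in cents; positive values sound squeaky
        engine.attach(playerNode)
        engine.attach(pitch)
        engine.connect(playerNode, to: pitch, format: nil)
        engine.connect(pitch, to: engine.mainMixerNode, format: nil)
    }

    func play(resource: String) throws {
        // "well_done" is a hypothetical bundled recording name.
        guard let url = Bundle.main.url(forResource: resource, withExtension: "m4a") else { return }
        let file = try AVAudioFile(forReading: url)
        try engine.start()
        playerNode.scheduleFile(file, at: nil)
        playerNode.play()
    }
}
```

One recording per event can then cover several "characters" by varying the pitch value, which addresses the app-size concern without an external TTS service.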