I've been researching this problem in various forums and I believe I'm getting close to a fix, so I decided to ask here for help and also to help anyone else who runs into this topic.
The problem involves the language of the SKRouteAdvices. When retrieved through
SKRoutingService.sharedInstance().routeAdviceListWithDistanceFormat(.Metric)
an array of SKRouteAdvices was returned, but all of the advices were written in English: the voice was in Portuguese, yet the .adviceInstruction was in English. I tried setting the advisorSettings (as I should anyway), but it didn't work. For some unknown reason, when I set it to TTS instead of pre-recorded audio, the advices came back written in Portuguese, but spoken with a weird TTS voice instead of the pre-recorded one, which was to be expected, really. Then, tired of trying to find an obvious fix, I decided to first retrieve the Portuguese advices, save them in an array, and then do it again as before to get the pre-recorded voice.
It turns out the framework has some hidden problem with this. I tried a couple of different ways to get around it, but the best I achieved was the result I wanted with roughly a 50% chance of a crash; I really don't know why, but sometimes it just crashed. So then I tried TTS again while trying to get the pre-recorded voices via the adviceInstruction property. The instruction comes in Portuguese, but all the audio files are named in English, so that doesn't work either.
Summing everything up: I need my SKRouteAdvices to come with Portuguese instructions and also with a pre-recorded voice. Any clue?
I gave up trying to find a native way to do it. I followed Sylvia's suggestion, but I had already tried that before. I managed to get the result I wanted by starting navigation twice. In the first pass I set the advisorType (in the SKAdvisorConfiguration of SKRoutingService.sharedInstance()) to .TextToSpeech, grab the Portuguese instructions, and save them into an array; then I proceed to the second pass, where I repeat the route configuration and navigation with advisorType set to .AudioFiles.
With this strange combination I got what I wanted.
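A rough sketch of that two-pass flow is below, using only the type and enum names quoted above; configureAndStartNavigation(advisorType:) is a hypothetical stand-in for your existing route-calculation and navigation-start code, so treat this as an outline rather than something to copy verbatim.

// Pass 1: run the advisor as TTS so adviceInstruction comes back in Portuguese.
configureAndStartNavigation(advisorType: .TextToSpeech)   // hypothetical helper wrapping your setup

var portugueseInstructions: [String] = []
if let advices = SKRoutingService.sharedInstance()
    .routeAdviceListWithDistanceFormat(.Metric) as? [SKRouteAdvice] {
    portugueseInstructions = advices.map { $0.adviceInstruction }
}

// Pass 2: restart the same route with advisorType set to .AudioFiles so the
// spoken guidance uses the pre-recorded voice, while the Portuguese strings
// captured above are used for on-screen display.
configureAndStartNavigation(advisorType: .AudioFiles)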
The text instructions are generated based on the config files (for full details see http://sdkblog.skobbler.com/advisor-support-text-to-speech-scout-audio/ and http://sdkblog.skobbler.com/advisor-support-text-to-speech-faq/)
The bottom line is that, due to how the audio files (.mp3) are linked together, the text advices generated when using the "audio" option will not be "human readable".
For TTS support, the advices are meant to be read by a voice, hence they are "human readable".
Right now you cannot have both "mp3" advices and human understandable text instructions at the same time.
I am trying to implement accessibility in my iOS project.
Is there a way to correct the pronunciation of specific words when VoiceOver is turned on? For example, the correct pronunciation of 'speech' is [spiːtʃ], but I want VoiceOver to read every occurrence of the word 'speech' the same as 'speak' [spiːk] throughout my whole project.
I know one way is to set the accessibility label of any UI element whose pronunciation I want to change to 'speak'. However, some elements are dynamic. For example, we get the label text from the back end, so we never know when the label text will be 'speech'. If I get the word 'speech' from the back end, I would like to hear VoiceOver read it as 'speak'.
Therefore, I would like to change a setting for VoiceOver so that every time the word is 'speech', VoiceOver will read it as 'speak'.
Can I do it?
Short answer.
Yes you can do it, but please do not.
Long Answer
Can I do it?
Yes, of course you can.
Simply fetch the data from the backend and do a find-replace on the string for any words you want spoken differently using a dictionary of words to replace, then add the new version of the string as the accessibility label.
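For completeness, a minimal sketch of that approach (fetchLabelTextFromBackend and myLabel are placeholders for your own networking code and UI element):

import UIKit

// Words you want VoiceOver to speak differently (purely illustrative).
let pronunciationOverrides = ["speech": "speak"]

func accessibilityText(for backendText: String) -> String {
    var result = backendText
    for (word, replacement) in pronunciationOverrides {
        result = result.replacingOccurrences(of: word, with: replacement, options: .caseInsensitive)
    }
    return result
}

// Show the original text, but hand VoiceOver the rewritten version.
let labelText = fetchLabelTextFromBackend()
myLabel.text = labelText
myLabel.accessibilityLabel = accessibilityText(for: labelText)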
SHOULD you do it?
Absolutely not.
Every time someone tries to "fix" pronunciation it ends up making things a lot worse.
I don't even understand why you would want screen reader users to hear "speak" wherever anyone else sees "speech"; it does not make sense and is likely to break the meaning of sentences:
"I attended the speech given last night, it was very informative".
Would transform into:
"I attended the speak given last night, it was very informative"
Screen reader users are used to it.
A screen reader user is used to hearing things said differently (and incorrectly!), my guess is you have not been using a screen reader long enough to get used to the idiosyncrasies of screen reader speech.
Far from helping screen reader users you will actually end up making things worse.
I have only ever overridden screen reader default behaviour twice, once when it was a version number that was being read as a date and once when it was a password manager that read the password back and would try and read things as words.
Other than those very narrow examples I have not come across a reason to change things for a screen reader.
What about braille users?
You could change things because they don't sound right. But braille users also use screen readers and changing things for them could be very confusing (as per the example above of "speech").
What about best practices?
"Give assistive technology users as similar an experience as possible to non assistive tech users". That is the number one guiding principle of accessibility, the second you change pronunciations and words, you potentially change the meaning of sentences and therefore offer a different experience.
Summing up
Anyway, this is turning into a rant when it isn't meant to be (my apologies, I am just trying to get the point across, as I answer similar questions to this quite often!). Hopefully you get the idea: leave it alone and present the same info. I haven't even covered different speech synthesizers, language translation, and the other things that using "unnatural" language can interfere with.
The easiest solution is to return a 2nd string from the backend that is used just for the accessibilityLabel.
If you need a bit more control, you can pass an AttributedString as the accessibilityLabel, with a number of different options for controlling pronunciation:
https://medium.com/macoclock/ios-attributed-accessibility-labels-f54b8dcbf9fa
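As a minimal sketch of the technique that article describes, you can attach an IPA pronunciation hint to part of the string (iOS 11+); myLabel stands in for whatever element you are labelling:

import UIKit

let text = "speech"
let attributed = NSMutableAttributedString(string: text)

// Tell VoiceOver to pronounce this word using the given IPA notation.
if let range = text.range(of: "speech") {
    attributed.addAttribute(.accessibilitySpeechIPANotation,
                            value: "spiːk",
                            range: NSRange(range, in: text))
}

myLabel.accessibilityAttributedLabel = attributed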
I'm planning to transcribe speech where the language is unknown, so I am trying to detect the spoken language automatically by providing multiple language codes. However, I can't seem to find an option to actually find out which language the transcription will be in.
I've looked through the dev page of the Speech-to-Text API, but I can't seem to find a way to output the language code of the transcribed text.
Can anyone help me with this?
Thank you.
In general, the language code is returned with the results. For example, see the sample code here, which shows how to retrieve the language code from the results.
However, see the issue mentioned here: the language code does not always get returned when multiple languages are specified. As reported in the comments, this is a problem with the Google Speech API, an issue which is reported here.
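As a rough illustration, if you call the REST endpoint directly, the detected language comes back as a languageCode field on each result. A minimal Codable sketch follows; the field names are assumed from the v1p1beta1 JSON schema, so double-check them against your client library.

import Foundation

struct RecognizeResponse: Decodable {
    struct Result: Decodable {
        struct Alternative: Decodable {
            let transcript: String
        }
        let alternatives: [Alternative]
        let languageCode: String?   // only populated when alternative languages are given
    }
    let results: [Result]
}

// Returns the BCP-47 codes the service detected for each result.
func detectedLanguages(from jsonData: Data) throws -> [String] {
    let response = try JSONDecoder().decode(RecognizeResponse.self, from: jsonData)
    return response.results.compactMap { $0.languageCode }
}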
I have the following issue. I have found a few topics here talking about it, but none of them actually answers my question.
I'm pretty new to iOS development; I searched the Apple documentation but didn't find anything useful.
I need to get the audio sample/buffer/stream from the headphone microphone into a manipulable variable or something like that, and then push it back to the headphones, so that I can hear my own voice while I'm talking.
I found things about AVFoundation but nothing more.
I know it's possible to do that, but I couldn't find out how.
Can anybody help me further?
Note that the language you are looking for is Swift, so you might find something useful in EZAudio, even though that project was deprecated on June 13, 2016. Check the example called PassThrough in that project: it does exactly this (hearing your voice while you talk), and it is written in Swift. Hope this helps.
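If you would rather stay with plain AVFoundation instead of EZAudio, a minimal passthrough sketch with AVAudioEngine could look like this (an untested outline; wear headphones, otherwise you will get feedback):

import AVFoundation

let engine = AVAudioEngine()   // keep a strong reference while the passthrough runs

func startPassthrough() throws {
    // Allow simultaneous recording and playback.
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, mode: .default, options: [.allowBluetooth])
    try session.setActive(true)

    let input = engine.inputNode
    let format = input.outputFormat(forBus: 0)

    // Route the microphone straight into the main mixer, which feeds the output.
    engine.connect(input, to: engine.mainMixerNode, format: format)

    // Optional: tap the stream if you want to inspect or manipulate the samples.
    input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
        // buffer.floatChannelData gives you the raw samples here.
    }

    try engine.start()
}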
I'm thinking about implementing basic voice control for an iOS app. The app will have a dictionary with about 30 entries, where each entry is a first and last name. When the user speaks to the app, the app will need to select the correct name from the list of ~30.
One thing I'm not sure about: the list of names is defined by each user of the app. So every user will have a different set of names.
I'm wondering if there is an open source library that is customizable at this level. My biggest concern is that I won't be able to let the user define the dictionary.
Any ideas on how this could be done?
Thanks in advance, and please forgive the vague question :)
Update: I am aware of the OpenEars library. I can't find anything on their site about whether they allow limited, user-defined dictionaries. I can see that an app developer can set a custom dictionary, but nothing about whether the app's end user could do this. Thanks for the help!
OpenEars allows you to define your own vocabulary out of the box using the LanguageModelGenerator class (http://www.politepix.com/openears/#LanguageModelGenerator_Class_Reference); see the sketch below.
You can ignore all words outside of the vocabulary you define by using the Rejecto plugin.
You can do something similar with Julius, but I'm told OpenEars has better acoustic models.
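Here is a rough sketch of feeding a user-defined name list into the LanguageModelGenerator. The method names follow the OpenEars 2.x tutorial and its Swift bridging, so double-check them against the version you install; Rejecto would then sit on top to reject anything outside this vocabulary.

// OpenEars is an Objective-C framework; import it via your bridging header.
// Names below are assumed from the OpenEars 2.x docs and may differ per version.

func buildVocabulary(fromUserNames names: [String]) -> (languageModel: String, dictionary: String)? {
    let generator = OELanguageModelGenerator()
    let modelName = "UserNames"   // arbitrary base name for the generated files

    // OpenEars expects plain uppercase words/phrases.
    let words = names.map { $0.uppercased() }

    let error = generator.generateLanguageModel(
        from: words,
        withFilesNamed: modelName,
        forAcousticModelAtPath: OEAcousticModel.path(toModel: "AcousticModelEnglish"))

    guard error == nil else { return nil }

    return (generator.pathToSuccessfullyGeneratedLanguageModel(withRequestedName: modelName),
            generator.pathToSuccessfullyGeneratedDictionary(withRequestedName: modelName))
}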
I have used Julius in the past; it worked very well on a Linux machine.
Now for iOS, the folks at creaceed have compiled it for our lovely platform and offer an SDK.
I have no clue how good it is, but at least there is a trial version you could check. In my opinion, for your purpose (1 out of ~30 possibilities) it should work pretty well.
I don't know if my approach to this is fundamentally wrong, but I'm struggling to get my head around a (seemingly trivial?!) localisation issue.
I want to display the title of a 'System' UITabBarItem (More, Favorites, Featured, etc...) in a navigation bar. But where do I get the string from? The strings file of the MainWindow.nib doesn't contain the string (I didn't expect it to) and reading the title of the TabBarItem returns nil, which is what stumped me.
I've been told there's no way to achieve it and that I'll just have to add my own localised strings for the terms in question. But I simply don't (want to) believe that!! That's maybe easy enough in some languages, but looking up, say, "More" already presents me with more than one possible word in some languages. I'm not happy about simply sending these words for translation either, because it still depends on the translator knowing exactly which term Apple uses. So am I missing something simple here? What do other people do?
Obviously, setting the system language on my test device and simply looking to see what titles the Tab Items have is another 'obvious' possibility. But I really have a problem with half baked workarounds like that. That'll work for most languages, but I'm really gonna have fun when it comes to Russian or Japanese.
I'm convinced there must be a more reliable way to do this. Surely there must be a .strings file somewhere in the SDK that has these strings defined?
Thanks in advance...
Rich
The simple and unfortunate answer is that aside from a very few standard elements (e.g. a Back button), you need to localize all strings yourself. Yes, UIKit has its own Localization.strings file but obviously that's outside of your app sandbox so you don't have access to it.
I filed a bug with Apple years ago about providing OS-level localization for common button titles, tab item labels, etc. That bug is still open but obviously they haven't done it yet (sorry, I don't have the radar # handy).
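In practice that means keeping your own keys for those titles. A minimal sketch (the "tab.more" key and its translations are whatever you put in your own Localizable.strings files, not anything read from Apple):

// Localizable.strings (en):  "tab.more" = "More";
// Localizable.strings (pt):  "tab.more" = "Mais";   // your translator's choice

navigationItem.title = NSLocalizedString("tab.more",
                                         comment: "Matches the title of the system 'More' tab")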