AVSpeechSynthesizer High Quality Voices - ios

Is it possible to use the enhanced/high quality voices (Alex in the U.S.) with the speech synthesizer? I have downloaded the voices but find no way to tell the synthesizer to use it rather than the default voice.
Since voices are generally selected by BCP-47 codes and there is only on for US English, it appears there is no way to further differentiate voices. Am I missing something? (One would think Apple might have considered a need for different dialects, but I am not seeing it).
TIA.

Yes, possible to pick from the 2 that seem to be available on my system, like this:
class Speak {
let voices = AVSpeechSynthesisVoice.speechVoices()
let voiceSynth = AVSpeechSynthesizer()
var voiceToUse: AVSpeechSynthesisVoice?
init(){
for voice in voices {
if voice.name == "Samantha (Enhanced)" && voice.quality == .enhanced {
voiceToUse = voice
}
}
}
func sayThis(_ phrase: String){
let utterance = AVSpeechUtterance(string: phrase)
utterance.voice = voiceToUse
utterance.rate = 0.5
voiceSynth.speak(utterance)
}
}
Then, somewhere in your app, do something like this:
let voice = Speak()
voice.sayThis("I'm speaking better Seppo, now!")

This was a bug in the previous versions of iOS that the apps using the synthesiser weren't using the enhanced voices. This bug has been fixed in iOS10. iOS10 now uses the enhanced voices.

Related

Play audio buffers generated by AVSpeechSynthesizer directly

We have a requirement for audio processing on the output of AVSpeechSynthesizer. So we started with using the write method of AVSpeechSynthesizer class to apply processing on top. of it. What we currently have:
var synthesizer = AVSpeechSynthesizer()
var playerNode: AVAudioPlayerNode = AVAudioPlayerNode()
fun play(audioCue: String){
let utterance = AVSpeechUtterance(string: audioCue)
synthesizer.write(utterance, toBufferCallback: {[weak self] buffer in
// We do our processing including conversion from pcmFormatFloat16 format to pcmFormatFloat32 format which is supported by AVAudioPlayerNode
self.playerNode.scheduleBuffer(buffer as! AVAudioPCMBuffer, completionCallbackType: .dataPlayedBack)
}
}
All of it was working fine before iOS 16 but with iOS 16 we started getting this exception:
[AXTTSCommon] TTSPlaybackEnqueueFullAudioQueueBuffer: error -66686 enqueueing buffer
Not sure what this exception means exactly. So we are looking for a way of addressing this exception or may be a better way of playing the buffers.
UPDATE:
Created an empty project for testing and it turns out the write method if called with an empty bloc generates these logs:
Code I have used for Swift project :
let synth = AVSpeechSynthesizer()
let myUtterance = AVSpeechUtterance(string: message)
myUtterance.rate = 0.4
synth.speak(myUtterance)
Can move let synth = AVSpeechSynthesizer() out of this method and declare on top for this class and use.
Settings to enable for Xcode14 & iOS 16 : If you are using XCode14 and iOS16, it may be voices under spoken content is not downloaded and you will get an error on console saying identifier, source, content nil. All you need to do is, go to accessiblity in settings -> Spoken content -> Voices -> Select any language and download any profile. After this run ur voice and you will be able to hear the speech from passed text.
It is working for me now.

AVSpeechSynthesizer: how to display in a default player view

I use AVSpeechSynthesizer to play text books via audio.
private lazy var synthesizer: AVSpeechSynthesizer = {
let synthesizer = AVSpeechSynthesizer()
synthesizer.delegate = self
return synthesizer
}()
let utterance = AVSpeechUtterance(string: text)
utterance.voice = AVSpeechSynthesisVoice(
language: languageIdentifier(from: language)
)
synthesizer.speak(utterance)
I want to update information in iPhone's default player view (probably naming is wrong 🙏):
indicate playing Chapter with some text
enable next button to play the next chapter
How can I accomplish this?
I really don't think you want to hack your way through this.. But if you really do I would:
Listen to remote commands (UIApplication.sharedApplication().beginReceivingRemoteControlEvents(), see Apple Sample Project
Set your properties on MPNowPlayingInfoCenter: MPNowPlayingInfoCenter.default().nowPlayingInfo[MPMediaItemPropertyTitle] = "Title"
Implement the AVSpeechSynthesizerDelegate and try to map the delegate functions to playback states and estimate the playback progress using speechSynthesizer(_:willSpeakRangeOfSpeechString:utterance:) (idk if possible)
You might have to play with the usesApplicationAudioSession property of AVSpeechSynthesizer to have more control over the audio session (set categories etc.)

Use AVSpeechSynthesizer in background without stopping f.e. music app

I'd like to know how to properly let my iPhone speak one sentence while my app is in background but then return to whatever was playing before.
My question is quite similar to AVSpeechSynthesizer in background mode but again with the difference that I want to be able to "say something" while in background without having to stop Music that is playing. So while my AVSpeechSynthesizer is speaking, music should pause (or be a bit less loud) but then it should resume. Even when my app is currently in background.
What I am trying to archive is a spoken summary of tracking-stats while GPS-Tracking in my fitness app. And chances are that you are listening to music is quite high, and I don't want to disturb the user...
I found the answer myself...
The important part ist to configure the AVAudioSession with the .duckOthers option:
let audioSession = AVAudioSession.sharedInstance()
try audioSession.setCategory(AVAudioSessionCategoryPlayback, with: .duckOthers)
This will make playback of f.e. music less loud but this would make it stay less loud even when speech is done. This is why you need to set a delegate for the AVSpeechSynthesizer and then handle it like so:
func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
guard !synthesizer.isSpeaking else { return }
let audioSession = AVAudioSession.sharedInstance()
try? audioSession.setActive(false)
}
That way, music will continue with normal volume after speech is done.
Also, right before speaking, I activate my audioSession just to make sure (not sure if that would really be necessary, but since I do so, I have no more problems...)
For swift 3, import AVKit then add
try? AVAudioSession.sharedInstance().setCategory(AVAudioSessionCategoryPlayback)
for swift 4.2 / Xcode 10
unfortunately .duckOthers is no longer available; I managed to make it work like that:
let audioSession = AVAudioSession.sharedInstance()
try! audioSession.setCategory(AVAudioSession.Category.ambient, mode: .default)
Swift 5.0
let synthesizer = AVSpeechSynthesizer()
let synthesizerVoice = AVSpeechSynthesisVoice(language: "en-US")
let str = "String"
let utterance = AVSpeechUtterance(string: str)
let audioSession = AVAudioSession.sharedInstance()
try! audioSession.setCategory(
AVAudioSession.Category.playback,
options: AVAudioSession.CategoryOptions.mixWithOthers
)
utterance.rate = 0.5
utterance.voice = synthesizerVoice
synthesizer.speak(utterance)
Background music don't stop and your word sound play even phone in silent mode by this two line of code before usual text to speech codes :
let audioSession = AVAudioSession.sharedInstance()
try!audioSession.setCategory(AVAudioSessionCategoryPlayback, with: AVAudioSessionCategoryOptions.mixWithOthers)

How can I play sound system with text?

Do we have any ways to get sound from text?
For example, we have:
let str = "Hello English" .
I want to get sound system from that text.
As Ravy Chheng answered, this uses the built-in AVFoundation library to initialize basic text-to speak function. For more information, check out the documentation by Apple: https://developer.apple.com/reference/avfoundation/avspeechsynthesizer
import AVFoundation
func playSound(str: String) {
let speechSynthesizer = AVSpeechSynthesizer()
let speechUtterance = AVSpeechUtterance(string: str)
speechSynthesizer.speak(speechUtterance)
}

AVSpeechSynthesizer output as file?

AVSpeechSynthesizer has a fairly simple API, which doesn't have support for saving to an audio file built-in.
I'm wondering if there's a way around this - perhaps recording the output as it's played silently, for playback later? Or something more efficient.
This is finally possible, in iOS 13 AVSpeechSynthesizer now has write(_:toBufferCallback:):
let synthesizer = AVSpeechSynthesizer()
let utterance = AVSpeechUtterance(string: "test 123")
utterance.voice = AVSpeechSynthesisVoice(language: "en")
var output: AVAudioFile?
synthesizer.write(utterance) { (buffer: AVAudioBuffer) in
guard let pcmBuffer = buffer as? AVAudioPCMBuffer else {
fatalError("unknown buffer type: \(buffer)")
}
if pcmBuffer.frameLength == 0 {
// done
} else {
// append buffer to file
if output == nil {
output = AVAudioFile(
forWriting: URL(fileURLWithPath: "test.caf"),
settings: pcmBuffer.format.settings,
commonFormat: .pcmFormatInt16,
interleaved: false)
}
output?.write(from: pcmBuffer)
}
}
As of now AVSpeechSynthesizer does not support this . There in no way get the audio file using AVSpeechSynthesizer . I tried this few weeks ago for one of my apps and found out that it is not possible , Also nothing has changed for AVSpeechSynthesizer in iOS 8.
I too thought of recording the sound as it is being played , but there are so many flaws with that approach like user might be using headphones, the system sound might be low or mute , it might catch other external sound, so its not advisable to go with that approach.
You can use OSX to prepare AIFF files (or, maybe, some OSX-based service) via NSSpeechSynthesizer method
startSpeakingString:toURL:

Resources