I would like to be able to test which text-to-speech voices are available for my iOS app to use with AVSpeechSynthesis. It is easy to generate a list of the installed voices, but Apple makes some of them off-limits for use by apps, and I would like to know which ones.
For example, consider the following test code (Swift 5.1):
import AVFoundation
...
func voiceTest() {
    let speechSynthesizer = AVSpeechSynthesizer()
    let voices = AVSpeechSynthesisVoice.speechVoices()
    for voice in voices where voice.language == "en-US" {
        print("\(voice.language) - \(voice.name) - \(voice.quality.rawValue) [\(voice.identifier)]")
        let phrase = "The voice you're now listening to is the one called \(voice.name)."
        let utterance = AVSpeechUtterance(string: phrase)
        utterance.voice = voice
        speechSynthesizer.speak(utterance)
    }
}
When I call voiceTest(), the console output is this:
en-US - Nicky (Enhanced) - 2 [com.apple.ttsbundle.siri_female_en-US_premium]
en-US - Aaron - 1 [com.apple.ttsbundle.siri_male_en-US_compact]
en-US - Fred - 1 [com.apple.speech.synthesis.voice.Fred]
en-US - Nicky - 1 [com.apple.ttsbundle.siri_female_en-US_compact]
en-US - Samantha - 1 [com.apple.ttsbundle.Samantha-compact]
en-US - Alex - 2 [com.apple.speech.voice.Alex]
Some of the voices speak in their actual voice, whereas some of them speak in the default voice instead. In my case both Nicky (com.apple.ttsbundle.siri_female_en-US_premium) and Alex (com.apple.speech.voice.Alex) are listed as high quality but sound instead like the low quality default, Samantha, when selected.
I know that Apple has said that the Siri voices are not available for use in third-party apps. When I manually download Samantha (High Quality) on my iPhone via Settings, it appears in the list and I can use it. Perhaps Alex is just the high-quality male Siri voice, even though Aaron would seem to be the low-quality Siri voice based on its identifier (com.apple.ttsbundle.siri_male_en-US_compact)? Is that why Alex and Nicky are the only two that are unavailable? If so, could my app simply exclude those two to produce the true list of available voices? It would be nice to have some clarity.
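Here is roughly what that exclusion would look like (the "siri" substring check and the Alex identifier are my own guesses based on the output above, not anything documented):
import AVFoundation

// Heuristic filter: skip the voices that, in my testing, fall back to the default voice.
// The "siri" substring check and the Alex identifier are guesses, not a documented contract.
func usableVoices(for language: String = "en-US") -> [AVSpeechSynthesisVoice] {
    return AVSpeechSynthesisVoice.speechVoices().filter { voice in
        voice.language == language
            && !voice.identifier.lowercased().contains("siri")
            && voice.identifier != "com.apple.speech.voice.Alex"
    }
}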
I've been looking for a way to programmatically use Siri's nice-sounding voice, such as English Siri Male (United States), and quickly discovered that it is not possible using the public Speech API, even though the voice can be selected in System Preferences.
To answer your question, there are at least two other ways of finding available voices in addition to your code example.
Using defaults command
defaults read com.apple.speech.voice.prefs > speech_prefs.txt
To find info on the voice currently selected in System Preferences, look for SelectedVoiceName in speech_prefs.txt.
For example, for English Siri Male (United States), this will be SelectedVoiceName = "Aaron Siri";.
Now, by further searching for aaron in speech_prefs.txt, you will find the following:
"VOICEID:com.apple.speech.synthesis.voice.custom.siri.aaron.premium_1" = {
BundleIdentifier = "com.apple.speech.synthesis.voice.custom.siri.aaron.premium";
I tried both of these strings when initializing a voice, but got an error saying the voice was not found.
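For reference, this is roughly what the attempt looks like (assuming AVSpeechSynthesisVoice is the API in play; both identifiers come straight from speech_prefs.txt above, and neither resolves):
import AVFoundation

// Neither identifier from speech_prefs.txt resolves to a usable voice;
// AVSpeechSynthesisVoice(identifier:) simply fails for both of them.
let candidateIdentifiers = [
    "com.apple.speech.synthesis.voice.custom.siri.aaron.premium_1",
    "com.apple.speech.synthesis.voice.custom.siri.aaron.premium"
]
for identifier in candidateIdentifiers {
    if let voice = AVSpeechSynthesisVoice(identifier: identifier) {
        print("Resolved voice: \(voice.name)")
    } else {
        print("No voice found for \(identifier)")
    }
}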
Looking for voice directories
There seem to be three locations:
/System/Library/Speech/Voices
/Library/Speech/Voices
~/Library/Speech/Voices
The third one seems to be a location for custom voices.
Each voice has its own directory.
If you compare the Info.plist files of a programmatically available voice and a programmatically unavailable one, you will see that they have different structures. For example, the programmatically unavailable voice lacks some attributes that correspond to the Speech API, such as VoiceSupportedCharacters. I believe this is because some voices are of an older generation and some are newer.
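If you want to poke at this yourself, here is a rough macOS-side sketch (the paths and the VoiceSupportedCharacters key come from the observations above; whether that key actually predicts programmatic availability is just my guess):
import Foundation

// Rough sketch: list the voice bundles in the three locations above and report whether
// each bundle's Info.plist contains the VoiceSupportedCharacters key.
let voiceDirectories = [
    "/System/Library/Speech/Voices",
    "/Library/Speech/Voices",
    NSString(string: "~/Library/Speech/Voices").expandingTildeInPath
]
let fileManager = FileManager.default
for directory in voiceDirectories {
    guard let bundles = try? fileManager.contentsOfDirectory(atPath: directory) else { continue }
    for bundle in bundles {
        let plistPath = "\(directory)/\(bundle)/Contents/Info.plist"
        guard let info = NSDictionary(contentsOfFile: plistPath) else { continue }
        let hasKey = info["VoiceSupportedCharacters"] != nil
        print("\(bundle): VoiceSupportedCharacters \(hasKey ? "present" : "missing")")
    }
}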
P.S.
Not directly relevant to your question, but just FYI: I'm still looking for a way to use Siri's voice programmatically. One idea is to make a copy of the voice directory and play with its Info.plist. Another idea is to automate the macOS UI: trigger the text-to-speech conversion by simulating the key press bound to the "Speak selected text when the key is pressed" option in System Preferences / Accessibility / Speech, and then record the audio.
I'd appreciate it if anyone can share other ideas.
Related
Suppose I'm making an app that a user can install several little interactive experiences (meditations) onto.
For convenience, I'd like my users to be able to start one by saying: “Hey Siri, start Beach Sunset in Meditations.”
Because of reasons, it makes sense for users to perform this action by voice, without ever first having interacted with Beach Sunset in the iOS app. (They may for example already “own” it through my service's web app.)
That is to say: I want a voice action like “Hey Siri, start Beach Sunset in Meditations” to work even without the user setting up a Shortcut for it first, or me “donating” actions for it.
Is that possible? (I feel like many of the default apps expose similar behavior, but maybe they're special.) If not, what is the next best thing I can do?
Are "donations" necessary for Siri to be aware of my app's voice actions, or are they simply a mechanism for hinting and predicting user behavior?
Are "shortcuts" necessary for Siri to be aware of my app's voice actions, or are they simply a mechanism for user phrase customization?
I've never added Siri support to an iOS app, but it seems “parameters” have gotten a lot more powerful in iOS 13. This answer suggests something similar wasn't possible in iOS 12, but I think it's also doing something somewhat different (I want to launch the app; they want to “create an object” presumably just using Intent UI. I don't know if this matters.)
What I've done
I've defined a custom intent in the Start category (LaunchMeditation) with a single parameter (meditationName).
I considered the standard Media intents, but the media here is interactive and not strictly audio/video, and I don't want to get in trouble.
I've added an Intents Extension to my app, and written a rudimentary test "handler" that just tries to pass the meditation name on to the app:
#import <Intents/Intents.h>
#import "IntentHandler.h"
#import "LaunchMeditationIntent.h" // header generated by Xcode for the custom intent

@interface IntentHandler () <LaunchMeditationIntentHandling>
@end

@implementation IntentHandler

- (id)handlerForIntent:(INIntent *)intent { return self; }

- (void)handleLaunchMeditation:(nonnull LaunchMeditationIntent *)intent
                    completion:(nonnull void (^)(LaunchMeditationIntentResponse * _Nonnull))completion {
    // XXX: Maybe activity can just be nil?
    NSUserActivity *activity = [[NSUserActivity alloc] initWithActivityType:@"com.example.meditations.activity.launch"];
    activity.title = [NSString stringWithFormat:@"Launch %@ in Meditations", intent.meditationName]; // Do I need this?
    activity.userInfo = @{@"meditationName": intent.meditationName};
    completion([[LaunchMeditationIntentResponse alloc] initWithCode:LaunchMeditationIntentResponseCodeContinueInApp
                                                       userActivity:activity]);
}

- (void)resolveMeditationNameForLaunchMeditation:(nonnull LaunchMeditationIntent *)intent
                                  withCompletion:(nonnull void (^)(INStringResolutionResult * _Nonnull))completion {
    completion([INStringResolutionResult successWithResolvedString:intent.meditationName]);
}

@end
When I test the Intents Extension, I can now make a Shortcut for it in Shortcuts, set its parameter, give it a name (like “Beach time”), and launch it by telling Siri that name — which is everything I don't want users to have to do.
Other than that, Siri responds with
Meditations hasn't added support for that with Siri.
…no matter how I phrase my request to start Beach Sunset. That “hasn't added support” sounds agonizingly much like there's simply something I'm missing.
I'll try to briefly answer all of your questions.
You can't create a Siri action without donating actions. Even once you donate your actions, they are not automatically registered with Siri; users must create a Shortcut to be able to use them.
The next best thing you can do is to inform your users about your Siri Shortcuts. To do this, you can show a pop-up or mention them on your onboarding screens. The good part is that you can send your users directly to the shortcut-creation screen via the code below, which you can trigger from a button tap or other gesture.
let shortcut = INShortcut(userActivity: shortcutActivity!) // shortcutActivity is your donated activity.
let vc = INUIAddVoiceShortcutViewController(shortcut: shortcut)
vc.delegate = self // INUIAddVoiceShortcutViewControllerDelegate
self.present(vc, animated: true, completion: nil)
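Since the view controller needs a delegate, here is a minimal sketch of that conformance (MyViewController is just a placeholder for whichever view controller presents the sheet; dismissing it in both callbacks is the minimum you need):
import IntentsUI

// Minimal delegate conformance; MyViewController is a placeholder name.
extension MyViewController: INUIAddVoiceShortcutViewControllerDelegate {

    // The user recorded a phrase and saved the shortcut (or an error occurred).
    func addVoiceShortcutViewController(_ controller: INUIAddVoiceShortcutViewController,
                                        didFinishWith voiceShortcut: INVoiceShortcut?,
                                        error: Error?) {
        controller.dismiss(animated: true, completion: nil)
    }

    // The user cancelled without adding the shortcut.
    func addVoiceShortcutViewControllerDidCancel(_ controller: INUIAddVoiceShortcutViewController) {
        controller.dismiss(animated: true, completion: nil)
    }
}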
Shortcuts are necessary for Siri to be aware of your implementation.
As far as I know, intent domains help you specify more parameters for your Siri Shortcuts, which enables you to create richer Siri interactions.
Apple also promotes Siri Shortcuts for commonly used apps. If your users use your app on a regular basis, or more often than other apps, they might see a Siri Shortcut suggestion on their home screen.
I also think it would be great to donate Siri Shortcuts without any user action but there would be certain problems such as:
What if two or more different apps use the same phrase for a Siri Shortcut?
How would Siri distinguish an unregistered command from ordinary conversation? For example, what if someone created a shortcut with the phrase "Hi Siri"?
Even if you donate an action with a certain phrase, Siri must learn how its user pronounces that phrase.
These may cause more harm than good, which is probably why Apple chose the current approach. I hope this answers your questions.
In Apple's iOS 13 feature list page they have the following blurb:
Image Capture API
The Image Capture API allows developers to leverage the Camera Connection Kit to import photos directly into their apps.
I've been looking but I can't seem to find any actual documentation about this change, and where it exists in the API. I also remember hearing a second or two talk about it in the keynote/state of the union in WWDC 19, but again no details in any session I've found so far.
It seems like you would be able to plug a camera or its SD card into the USB-C/Lightning port on the iOS device and access it from within a third-party app. I know you can import to the system photo library, but that has been around for years. I also know about the ExternalAccessory framework for MFi hardware, but I don't see any significant changes to it, and it doesn't seem to expose the described functionality.
I do see that a UIDocumentPickerViewController can be shown, and it allows the user to select a location that may be on a connected USB device. While that could work, it's not camera-specific and would be quite error-prone if the user doesn't select a valid camera location.
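For completeness, this is the kind of fallback I mean, a plain folder picker (whether the user actually picks the camera's DCIM folder is entirely up to them):
import UIKit
import MobileCoreServices

// Fallback sketch: present a plain folder picker. Not camera-specific, so the user
// could select anything, which is exactly why this approach feels error-prone.
func presentFolderPicker(from presenter: UIViewController, delegate: UIDocumentPickerDelegate) {
    let picker = UIDocumentPickerViewController(documentTypes: [kUTTypeFolder as String], in: .open)
    picker.delegate = delegate
    presenter.present(picker, animated: true, completion: nil)
}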
Does anybody know where I can find more info about this change, or how you can programmatically access the camera's filesystem? The camera will have the standard DCIM folder structure, so it is recognized as a camera filesystem by many Mac apps.
You're looking for the ImageCaptureCore framework. This is the same framework that exists on macOS for importing from SD Cards and Cameras. It is now available in iOS 13.2.
Update:
The ImageCaptureCore API is now working as of iOS 13.2.
However, be warned that as of iOS/iPadOS 13.1 Beta 3 (17A5837a) I have not been able to get it working yet (reported to Apple FB6799036). It is now listed with an asterisk on the iPadOS Features page indicating that it will be "Coming later this year".
I'm able to start an ICDeviceBrowser, but I see permissions errors when a device is connected and don't get any delegate messages. So there may be some permission or entitlement that is needed before it starts working.
Unfortunately there is no documentation or sample code (even for macOS) on Apple's developer site. But the framework does exist in the iOS 13 SDK and you can look at the header files there.
We use this framework in our macOS app and using just the headers to figure things out isn't too bad. You'd start by creating an ICDeviceBrowser (ICDeviceBrowser.h), setting its delegate, and then starting the browser:
#import <ImageCaptureCore/ImageCaptureCore.h>

@interface CameraManager : NSObject <ICDeviceBrowserDelegate>
{
    ICDeviceBrowser* _deviceBrowser;
}
@end

@implementation CameraManager

- (id)init
{
    if ((self = [super init]))
    {
        _deviceBrowser = [[ICDeviceBrowser alloc] init];
        _deviceBrowser.delegate = self;
        [_deviceBrowser start];
    }
    return self;
}

...

@end
You should then start receiving delegate messages when a camera device is connected:
- (void)deviceBrowser:(ICDeviceBrowser*)browser didAddDevice:(ICDevice*)addedDevice moreComing:(BOOL)moreComing;
- (void)deviceBrowser:(ICDeviceBrowser*)browser didRemoveDevice:(ICDevice*)removedDevice moreGoing:(BOOL)moreGoing;
When you get a didAddDevice: message you'll then want to use the ICDevice (ICDevice.h) and ICCameraDevice (ICCameraDevice.h) APIs to set a delegate and start a session. Once the session has started you'll start receiving delegate messages:
- (void)deviceBrowser:(ICDeviceBrowser*)browser didAddDevice:(ICDevice*)addedDevice moreComing:(BOOL)moreComing
{
    if ((addedDevice.type & ICDeviceTypeMaskCamera) == ICDeviceTypeCamera)
    {
        ICCameraDevice* camera = (ICCameraDevice *) addedDevice;
        camera.delegate = self;
        [camera requestOpenSession];
        // probably want to save 'camera' to a member variable
    }
}
You can use the delegate method:
- (void)cameraDevice:(nonnull ICCameraDevice *)camera
didAddItems:(nonnull NSArray<ICCameraItem *> *)items;
To get a list of items as they are enumerated by the API or wait for:
- (void)deviceDidBecomeReadyWithCompleteContentCatalog:(ICDevice*)device;
And then use the .contents property on the ICCameraDevice to get all of the contents.
From there you can use the ICCameraDevice to request thumbnails, metadata, and to download specific files. I'll leave that as an exercise for the reader.
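To give a rough idea of that step, here is a minimal Swift sketch (the snippets above are Objective-C, but the calls map directly); it just walks the .contents array once the catalog is complete and logs what the camera exposes:
import ImageCaptureCore

// Sketch: call this once deviceDidBecomeReadyWithCompleteContentCatalog has fired.
func logContents(of camera: ICCameraDevice) {
    guard let items = camera.contents else { return }
    for item in items {
        if let file = item as? ICCameraFile {
            // From here you could request a thumbnail, metadata, or a download of the file.
            print("file: \(String(describing: file.name)) - \(file.fileSize) bytes")
        } else if let folder = item as? ICCameraFolder {
            // Folders such as DCIM show up as ICCameraFolder items.
            print("folder: \(String(describing: folder.name))")
        }
    }
}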
As I mentioned above this doesn't seem to be working in iOS/iPadOS 13.1 Beta 3. Hopefully this will all start working soon as I'd really like to start testing it myself.
This is now working in iOS 13.2.
I am writing an app that includes text-to-speech using AVSpeechSynthesizer. The code for generating the utterance and using the speech synthesizer has been working fine.
let utterance = AVSpeechUtterance(string: text)
utterance.voice = currentVoice
speechSynthesizer.speak(utterance)
Now with iOS 11, I want to match the voice to the one selected by the user in the phone's Settings app, but I do not see any way to get that setting.
I have tried getting the list of installed voices and looking for one that has a quality of .enhanced, but sometimes there is no enhanced voice installed, and even when there is, it may or may not be the voice selected by the user in the Settings app.
static var enhanced: AVSpeechSynthesisVoice? {
    for voice in AVSpeechSynthesisVoice.speechVoices() {
        if voice.quality == .enhanced {
            return voice
        }
    }
    return nil
}
The questions are twofold:
How can I determine which voice has been selected by the user in the Setting app?
Why on some iOS 11 phones that are using the new Siri voice am I not finding an "enhanced" voice installed?
I suppose that if there were a method for selecting the same voice as in the Settings app, it would be shown in the documentation for the AVSpeechSynthesisVoice class under the Finding Voices topic. Jumping to the definition of AVSpeechSynthesisVoice in code, I couldn't find any other methods for retrieving voices.
Here's my workaround for getting an enhanced voice in the app I am working on:
Enhanced versions of voices are probably not present on new iOS devices by default, in order to save storage. Iterating through the available voices on my brand-new iPhone, I only found Default-quality voices, such as: [AVSpeechSynthesisVoice 0x1c4e11cf0] Language: en-US, Name: Samantha, Quality: Default [com.apple.ttsbundle.Samantha-compact]
I found this article on how to enable additional VoiceOver voices and downloaded the one named "Samantha (Enhanced)" among them. Checking the list of available voices again, I noticed the following addition:
[AVSpeechSynthesisVoice 0x1c4c03060] Language: en-US, Name: Samantha (Enhanced), Quality: Enhanced [com.apple.ttsbundle.Samantha-premium]
I was then able to select an enhanced voice in Xcode. Given that AVSpeechSynthesisVoice.currentLanguageCode() exposes the currently selected language, I ran the following code to pick the first enhanced voice matching that language; if no enhanced version was available, I'd just fall back to a default one. (The code below is from a custom voice-over class I am creating to handle all speech in my app; it updates that class's voice property.)
var voice: AVSpeechSynthesisVoice!

for availableVoice in AVSpeechSynthesisVoice.speechVoices() {
    // Look for the enhanced version of the voice matching the currently selected
    // language among the available voices. Usually there is at most one.
    if availableVoice.language == AVSpeechSynthesisVoice.currentLanguageCode() &&
        availableVoice.quality == AVSpeechSynthesisVoiceQuality.enhanced {
        self.voice = availableVoice
        print("\(availableVoice.name) selected as voice for uttering speeches. Quality: \(availableVoice.quality.rawValue)")
    }
}
if let selectedVoice = self.voice {
    // Successfully unwrapped: the loop above found an enhanced voice.
    print("The following voice identifier has been loaded: ", selectedVoice.identifier)
} else {
    // No enhanced voice found: fall back to any voice matching the device's current language.
    self.voice = AVSpeechSynthesisVoice(language: AVSpeechSynthesisVoice.currentLanguageCode())
}
I am also hoping Apple will expose a method to directly load the selected voice, but I hope this workaround can serve you in the meantime. I guess Siri's enhanced voice is downloaded on the fly, so maybe that's why it takes so long to answer my voice commands :)
Best regards.
It looks like the new Siri voice in iOS 11 isn't part of the AVSpeechSynthesis API, and isn't available to developers.
In macOS 10.13 High Sierra (which also gets the new voice), there seems to be a new SiriTTS framework that's probably related to this functionality, but it's in PrivateFrameworks so it doesn't have a developer API.
I'll try to provide a more detailed answer. AVSpeechSynthesizer cannot use the Siri voice. Apple has locked this voice down to protect privacy, as a malicious app could impersonate Siri and extract private information that way.
Apple hasn't changed this for years, but there is an ongoing initiative regarding it. We already know that privacy-sensitive features in iOS can be accessed with user permission, and there is no reason the Siri voice couldn't be gated behind a permission as well. You may vote for this to happen via this petition, and with some luck Apple may implement it in the future: https://www.change.org/p/apple-apple-please-allow-3rd-party-apps-to-use-siri-voices-for-improved-accessibility
I'm looking for recommendations for an iOS barcode scanner app, specifically for iPad, which supports a custom URL callback so the app can be launched from a web browser.
Additionally, it needs to support a custom search URL which will send the user back to the website once the barcode has been decoded into a URN (SKU).
I have discovered ZBar, which is an excellent app; unfortunately it doesn't support a custom URL callback and it's designed for the iPhone.
Another app, pic2shop PRO, seems to tick these boxes, but it's relatively expensive at £10.49 and the setup will require somewhere in the region of 200 installs.
I did a similar project using the free version of pic2shop. The thing is that the free version can only read these types of barcodes: UPC-A, UPC-E, EAN-13 and EAN-8, according to the app's documentation:
Pic2shop is a free barcode scanner app available for iOS® and Android®. It reads UPC-A, UPC-E, EAN-13, EAN-8 and QR codes. The app also displays comparison shopping results for UPC and EAN.
From my personal experience, I can say that it scans and decodes barcodes very fast and very accurately.
In my project the app is launched from a webpage, and it works for both Android and iOS. In order to get it working, you have to invoke the pic2shop app from a URL and set your callback address. You will find the decoded barcode data as the value of a parameter in the callback URL. To help you more, you can get those values using the JavaScript function found here.
For example:
<input type="button" onclick="scan();" value="Scan Barcode">
<script>
function scan() {
    window.location = "pic2shop://scan?callback=http://yourwebsiteurl.com/index.html?barcode=ean";
}
</script>
As soon as the item is successfully scanned, it will redirect you to the callback URL with the actual barcode number as the value of the parameter, for example http://yourwebsiteurl.com/index.html?barcode=5123548745123. I already described above how to get the value of a URL parameter with JavaScript.
The PDF417.mobi Pro barcode scanner app supports that use case.
Note: I'm a developer on that project.
Basically, the app can be launched from any other app, including a web application, by opening a URL of the form: pdf417://scan?type=PDF417,UPCA&callback=myscheme://myaction
The app then scans the barcode in the requested formats (PDF417 and UPCA in this example) until a result is obtained.
Then, the app opens the URL myscheme://myaction. In your case, this can be your web service, http://www.somemyscanner.com/service.
Specifically, it will open the URL using format: http://www.somemyscanner.com/service?data=[data]&type=[type].
You can then use those parameters to implement your desired functionalities.
I tried the PDF417 app and it is EXTREMELY expensive for an app ($28), and it does not work. I bought it anyway because I am trying to solve the same issue, and I can tell you it is not the solution for general barcode scanning.
It might work with PDF417 barcodes, but those are few and far between, and I haven't been able to get it to work. It definitely does not support any standard barcode formats. It also has no settings panel (in Settings), and the tap target in the app that should open settings just takes you to the company web site.
I am still testing other apps but haven't found any that does what you ask. Red Laser used to, but it no longer has that functionality.
Is there a way to detect if the user is running the AIR application under en_GB locale on Windows? Capabilities.language returns only "en" and Capabilities.languages[0] returns "en_US" :(
Unfortunately, no.
But there will be something soon (sorry, I can't tell you more right now)!
Check here: http://www.adobe.com/cfusion/event/index.cfm?event=detail&id=1489921
"Get the inside scoop on the new
mobile features in Flash Player 10.1,
as well as the new global error
handling, UI, globalization, and media
playback features."
Now that the globalization features are out in Flash Player 10.1, you can use them. Check out the documentation for them here:
http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/flash/globalization/package-detail.html
and more info here:
http://www.adobe.com/devnet/flashplayer/articles/flash_globalization_package.html#articlecontentAdobe_numberedheader
You can easily get the default locale as a string like so:
new StringTools(LocaleID.DEFAULT).actualLocaleIDName; // returns en-GB if region is United Kingdom on OSX