I want to know if there's a way to use iOS speech recognition in offline mode. I didn't see anything about it in the documentation (https://developer.apple.com/reference/speech).
I am afraid that there is no way to do it (however, please make sure to check the update at the end of the answer).
As mentioned in the official Speech framework documentation:
Best Practices for a Great User Experience:
Be prepared to handle the failures that can be caused by reaching speech recognition limits.
Because speech recognition is a network-based service, limits are
enforced so that the service can remain freely available to all apps.
From an end user's perspective, trying to get Siri's help without a network connection fails with an error screen.
Also, when composing a message, for example, you'll notice that the microphone (dictation) button is disabled if the device is not connected to a network.
iOS itself won't enable this feature until it has verified the network connection, so I assume the same applies to third-party developers using the Speech framework.
UPDATE:
After watching the Speech Recognition API session (especially the part from 03:00 to 03:25), I came up with the following:
The Speech Recognition API usually requires an internet connection, but some newer devices support this feature all the time. You might want to check whether the given language is available or not.
Adapted from the SFSpeechRecognizer documentation:
Note that a supported speech recognizer is not the same as an
available speech recognizer; for example, the recognizers for some
locales may require an Internet connection. You can use the
supportedLocales() method to get a list of supported locales and the
isAvailable property to find out if the recognizer for a specific
locale is available.
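The distinction between a supported and an available recognizer can be checked at runtime. The following is only a minimal sketch: the "en-US" locale is an assumed example, not something from the question.

```swift
import Speech

// List every locale the Speech framework supports on this OS version.
let supported = SFSpeechRecognizer.supportedLocales()
print("Supported locales: \(supported.map { $0.identifier }.sorted())")

// "en-US" is an assumed example locale; substitute your own.
let locale = Locale(identifier: "en-US")

// The initializer is failable: it returns nil for unsupported locales.
if let recognizer = SFSpeechRecognizer(locale: locale) {
    // isAvailable may be false, for example when the recognizer needs
    // a network connection and the device is offline.
    print("Recognizer available right now: \(recognizer.isAvailable)")
} else {
    print("Locale \(locale.identifier) is not supported")
}
```

A recognizer that exists but reports isAvailable == false is the "supported but not available" case described in the quote.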
Further Reading:
These topics might be related:
Which iOS devices support offline speech recognition?
How to Enable Offline Dictation on Your iPhone?
Will Siri ever work offline?
Offline transcription is available starting in iOS 13. You enable it with requiresOnDeviceRecognition.
Example code (Swift 5):
// Create and configure the speech recognition request.
recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
guard let recognitionRequest = recognitionRequest else {
    fatalError("Unable to create a SFSpeechAudioBufferRecognitionRequest object")
}
recognitionRequest.shouldReportPartialResults = true
// Keep speech recognition data on device
if #available(iOS 13, *) {
    recognitionRequest.requiresOnDeviceRecognition = true
}
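Not every device and locale can recognize speech on-device, so it may be safer to check supportsOnDeviceRecognition before requiring it. The helper below is a sketch; the function name preferOnDevice is hypothetical and not part of the Speech framework.

```swift
import Speech

/// Hypothetical helper: requires on-device recognition only when the
/// recognizer reports support for it. Returns true if the request
/// will be handled offline.
@available(iOS 13, macOS 10.15, *)
func preferOnDevice(_ request: SFSpeechRecognitionRequest,
                    using recognizer: SFSpeechRecognizer) -> Bool {
    if recognizer.supportsOnDeviceRecognition {
        request.requiresOnDeviceRecognition = true
        return true
    }
    // Leave the requirement off so the framework may use the server.
    request.requiresOnDeviceRecognition = false
    return false
}
```

With this approach the request still works on devices without on-device support, at the cost of sending audio over the network there.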
I'm making an app that will use speech recognition, and I want to know how frequently, or when, my app will encounter this scenario.
I know that this relates to the device restricting speech recognition rather than the user, but when exactly does it happen?
Is it due to specific models not supporting speech recognition, or is it specific to an iOS version,
or are there settings that can restrict apps from using speech recognition?
Though no longer quite accurate, think of a restriction as a parental control that blocks a user from even having the option to enable a service controlled by device privacy settings.
https://support.apple.com/en-ca/HT201304
This falls under "Here are the things you can restrict:"
Speech Recognition: Prevent apps from accessing Speech Recognition or
Dictation
How often will you encounter it? Who knows, but if your app targets minors, the chance is likely higher, though this is purely speculative.
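Whatever the frequency, the restricted case can be detected and handled explicitly. This is a sketch only; describe and checkSpeechAccess are hypothetical names, and requesting authorization assumes the app's Info.plist contains an NSSpeechRecognitionUsageDescription entry.

```swift
import Speech

/// Hypothetical helper: maps an authorization status to a short description.
func describe(_ status: SFSpeechRecognizerAuthorizationStatus) -> String {
    switch status {
    case .authorized:    return "authorized"
    case .denied:        return "denied by the user"
    case .restricted:    return "restricted on this device"
    case .notDetermined: return "not requested yet"
    @unknown default:    return "unknown"
    }
}

/// Hypothetical helper: logs the current status, then asks for access.
func checkSpeechAccess() {
    // Reading the status never prompts the user.
    print("Current status: \(describe(SFSpeechRecognizer.authorizationStatus()))")

    // This may show the system permission prompt.
    SFSpeechRecognizer.requestAuthorization { status in
        // The handler may run on a background queue.
        DispatchQueue.main.async {
            print("Status after request: \(describe(status))")
        }
    }
}
```

When the status is .restricted, the user cannot grant access, so the app should hide or disable its speech features rather than ask again.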
To answer your other question:
...is it due to some specific models not supporting speech
recognition...
There is a different way to test for speech support on a device:
https://developer.apple.com/documentation/speech/sfspeechrecognizer/1649885-isavailable
Using isAvailable (for Swift) or available (Obj-C), you can tell if the speech recognizer is available.
Since you marked your question as Objective-C, the following would work:
SFSpeechRecognizer *recognizer = [[SFSpeechRecognizer alloc] init];
if (recognizer.available) {
    // Do recognizer things
}
The same in Swift:
// SFSpeechRecognizer() is a failable initializer, so unwrap it first.
if let recognizer = SFSpeechRecognizer(), recognizer.isAvailable {
    // Do recognizer things
}
I am working on an app that uses the new Speech framework in iOS 10 to do some speech-to-text work. What is the best way to stop recognition when the user stops talking?
Not the best, but a possible solution is to track the elapsed time since the last result and stop recognition after a certain amount of time.
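A minimal sketch of that timeout idea, using only Foundation. The class name SilenceWatchdog and its methods are hypothetical, and a suitable timeout value is something you would have to tune.

```swift
import Foundation

/// Hypothetical helper: calls `onSilence` once `touch()` has not been
/// called for `timeout` seconds.
final class SilenceWatchdog {
    private let timeout: TimeInterval
    private let onSilence: () -> Void
    private var pending: DispatchWorkItem?

    init(timeout: TimeInterval, onSilence: @escaping () -> Void) {
        self.timeout = timeout
        self.onSilence = onSilence
    }

    /// Call this each time the recognizer delivers a (partial) result.
    func touch() {
        // Cancel the previous countdown and start a new one.
        pending?.cancel()
        let item = DispatchWorkItem { [weak self] in self?.onSilence() }
        pending = item
        DispatchQueue.main.asyncAfter(deadline: .now() + timeout, execute: item)
    }

    /// Stops the countdown without firing the callback.
    func cancel() {
        pending?.cancel()
        pending = nil
    }
}
```

In the recognition task's result handler you would call touch(), and in onSilence you would stop the audio engine and call endAudio() on the SFSpeechAudioBufferRecognitionRequest.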
I am creating an iOS application using WebRTC for video conferencing. I want to detect who is talking in the peer connection.
To be more specific, I want to detect the audio activity of the remote peer I am connected to, so that I can detect the person who is currently speaking.
This can be implemented by measuring the value of "audioOutputLevel" in the peer connection's stats reports. The delegate method you should study is:
- (void)peerConnection:(RTCPeerConnection*)peerConnection didGetStats:(NSArray*)stats
Check out this guide for building a sample WebRTC iOS application.
Check the section WebRTC Stats reporting
For example, the audioSendInputLevel property indicates the mic input level even while the audio track is disabled, so you can check whether the user is currently speaking.
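Once the levels have been read from the stats reports, picking the current speaker is a simple comparison. The sketch below deliberately avoids any WebRTC calls: it assumes you have already collected one level string per peer (stats values are assumed here to arrive as strings), and both the function name activeSpeaker and the threshold are hypothetical.

```swift
/// Hypothetical helper: returns the id of the loudest peer whose level
/// exceeds `threshold`, or nil if nobody is loud enough.
/// `levels` maps a peer id to the level string read from its stats.
func activeSpeaker(levels: [String: String], threshold: Double) -> String? {
    // Drop entries whose value is not a valid number.
    let parsed = levels.compactMapValues { Double($0) }
    guard let loudest = parsed.max(by: { $0.value < $1.value }),
          loudest.value > threshold else {
        return nil
    }
    return loudest.key
}
```

The threshold depends on the scale of the reported level, so it is best determined by logging real values from your own calls.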
I'd like my iOS app to use text-to-speech to read to the user some information that it receives from a server, and I'd also like to allow the user to stop such speech with a voice command. I have tried speech recognition frameworks for iOS like OpenEars, and the problem I find is that it listens to and detects what the app itself is "saying", which interferes with the recognition of the user's voice commands.
Has somebody dealt with this scenario in iOS and found a solution for that? Thanks in advance
It is not a trivial thing to implement. Unfortunately, iOS and other platforms record the sound that is playing through the speaker. The only choice you have is to use a headset. In that case speech recognition can continue listening for input. In OpenEars, recognition is disabled during TTS unless a headset is plugged in.
If you still want to implement this feature, which is called "barge-in", you have to do the following:
1. Store the audio you play (it is the reference signal that the microphone will pick up).
2. Implement a noise cancellation algorithm that effectively removes that audio from the recording. You can use cross-correlation to find the proper offset in the recording and spectral subtraction to remove the audio.
3. Recognize the speech in the remaining signal.
It is not possible to do that without significant modification of the OpenEars sources.
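To illustrate the cross-correlation part of the noise cancellation step, here is a brute-force sketch that finds where the played audio appears in the recording. The function name bestLag is hypothetical, the samples are assumed to be mono floats at the same sample rate, and a real implementation would use an FFT-based correlation for speed.

```swift
/// Hypothetical helper: returns the lag (in samples) at which `reference`
/// (the audio that was played) best lines up with `recording`.
func bestLag(of reference: [Float], in recording: [Float], maxLag: Int) -> Int {
    precondition(maxLag >= 0, "maxLag must not be negative")
    var bestLag = 0
    var bestScore = -Float.infinity
    for lag in 0...maxLag {
        // Correlation score: sum of products at this alignment.
        var score: Float = 0
        for i in 0..<reference.count {
            let j = i + lag
            if j >= recording.count { break }
            score += reference[i] * recording[j]
        }
        if score > bestScore {
            bestScore = score
            bestLag = lag
        }
    }
    return bestLag
}
```

The returned lag tells you where to align the stored audio before subtracting it from the recording; the spectral subtraction itself is not shown here.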
A related question is "Android Speech Recognition while music is playing".
I have made an app that uses the OpenEars framework to read out some text. I haven't used any of OpenEars' speech recognition features, just the text-to-speech feature. My app got rejected by Apple, saying that the app asks for permission to use the microphone while it doesn't have any features of that kind. The following is the rejection message from Apple:
During review we were prompted to provide consent to use the microphone, however, we were not able to find any features or functionality that use the microphone for audio recording.
The microphone consent request is generated by the use of either AVAudioSessionCategoryRecord or AVAudioSessionCategoryPlayAndRecord audio categories.
If you do not intend to record audio with your application, it would be appropriate to choose the AVAudioSession session category that fits your application's needs or modify your app to include audio-recording features.
For more information, please refer to the Security section of the iOS SDK Release Notes for iOS 7 GM Seed.
I have searched the app for AVAudioSessionCategoryRecord or AVAudioSessionCategoryPlayAndRecord audio categories as mentioned in the message but couldn't find any. How can I disable the prompting for permission to use microphone?
Your application was rejected because it requests microphone access that it doesn't need. By default, OpenEars interfaces with the microphone, which is why the permission prompt came up. The prompt cannot be suppressed, because Apple tightened security so that users have more control over what their applications are allowed to do. If you have to use OpenEars' audio management feature for speech recognition, see Update 1; otherwise, read on for a different solution using Apple's speech synthesizer on iOS 7.
In your case, if all you want to do is read out some text, you can use the iOS 7 speech synthesizer, which is the same synthesizer used to create Siri's voice.
It's very easy to set up, and I am currently using it in one of my projects to interact with the user via voice. Here's a quick tutorial on how to get it all set up:
Speech synthesizer tutorial
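For a rough idea of what that looks like, here is a minimal sketch using AVSpeechSynthesizer from AVFoundation. The speak function name and the "en-US" default are assumptions made for illustration.

```swift
import AVFoundation

// Keep a strong reference; speech stops if the synthesizer is deallocated.
let synthesizer = AVSpeechSynthesizer()

/// Hypothetical helper: reads `text` aloud in the given language.
func speak(_ text: String, language: String = "en-US") {
    let utterance = AVSpeechUtterance(string: text)
    // Returns nil for unknown language codes; the system then uses
    // its default voice.
    utterance.voice = AVSpeechSynthesisVoice(language: language)
    utterance.rate = AVSpeechUtteranceDefaultSpeechRate
    synthesizer.speak(utterance)
}
```

Because this only plays audio, it does not need a recording audio session category, so no microphone prompt should appear.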
UPDATE 1
After @Halle's comment, I decided to update the post for those who have to use the OpenEars framework but only its FliteController text-to-speech feature, without any OpenEars speech recognition.
You can set the FliteController property noAudioSessionOverrides to TRUE to ensure that OpenEars won't interface with the audio recording stream; this stops the microphone permission alert from popping up.
[self.fliteController setNoAudioSessionOverrides:TRUE];
UPDATE 2
Based on @Halle's comment, you no longer need to apply Update 1:
Just an update that starting with today's update 1.65, FliteController won't ever make audio session calls on its own, so there is no further rejection danger here and it isn't necessary to set noAudioSessionOverrides.
I'm sorry your app was rejected. To use TTS only without any of the audio session management related to speech recognition in OpenEars, set FliteController's property noAudioSessionOverrides to TRUE. This will result in no audio session changes/no use of the mic stream.
I'll see if I can make the documentation for this setting a bit more prominent for developers doing TTS with OpenEars' FliteController only.
For completeness' sake, here is the documentation on how to greatly reduce your app binary size when using OpenEars, since that was also an issue for you:
http://www.politepix.com/forums/topic/slimming-down-your-app/
http://www.politepix.com/openears/support/#Q_How_can_I_trim_down_the_size_of_the_final_binary_for_distribution
Edit: starting with today's version 1.65 of OpenEars and its plugins, if you just use FliteController there is no danger of rejection because the TTS classes no longer make any calls to the audio session by themselves. Thanks for the heads-up about this and, again, sorry you had a rejection due to this.