Does the library/sdk support a "hold to talk" mode? - google-assistant-sdk

My application requires that the user be allowed to record their utterance "offline" while they're holding down a button; I then forward the utterance to the assistant, as opposed to just streaming the mic to the service.
Is this mode supported?

Yes, and almost out of the box. The library/SDK can work with audio files, so you just need to record the utterance offline with your "hold to talk" code and send it to the assistant right after.
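As a concrete illustration, the SDK's pushtotalk Python sample accepts a pre-recorded audio file instead of live mic input, which is exactly this flow: record while the button is held, then hand the file over. The credential path, device model ID, and file names below are placeholders you would substitute for your own:

```shell
# Install the SDK samples once:
pip install --upgrade "google-assistant-sdk[samples]"

# Record the utterance however your "hold to talk" code does it,
# then send the file to the Assistant instead of streaming the mic:
googlesamples-assistant-pushtotalk \
    --credentials /path/to/credentials.json \
    --device-model-id my-device-model-id \
    -i utterance.wav \
    -o assistant-reply.wav
```

The `-i`/`--input-audio-file` flag replaces the microphone as the audio source, and `-o` writes the Assistant's spoken reply to a file you can play back.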

Related

Agora Interactive Live Video Streaming - How to enable audio on broadcaster side?

https://docs.agora.io/en/Interactive%20Broadcast/start_live_ios?platform=iOS
I've followed the above tutorial to implement interactive live video streaming. I have one broadcaster and multiple audience members; only the broadcaster can broadcast, and the audience can only watch.
The broadcaster can't hear his own audio. Is there a way to enable audio on the broadcaster side so that he can hear himself?
I've used the code from the tutorial above and set the role to .broadcaster on the broadcaster side; on the audience side it is set to .audience.
Broadcaster
func setClientRole() {
// Set the client role as "host"
agoraKit?.setClientRole(.broadcaster)
}
Audience
func setClientRole() {
// Set the client role as "audience"
agoraKit?.setClientRole(.audience)
}
Generally with video streaming services the local user cannot hear their own audio by design (look at YouTube Live, FB/Insta Live, etc.). Otherwise it would cause echo, or the echo cancellation could end up muting the audio. It is also very disorienting for users to hear themselves, so I would recommend against this.
In an effort to still answer your question: if having that mic audio is imperative to your project, I would recommend forcing the user to wear headphones to avoid echo issues. That way you can use a custom audio source (full guide), where you initialize the mic yourself and can send the audio to the headphones as well as pass it to the Agora SDK.
Since the implementation end of this could vary greatly depending on your project, I'll explain the basic concept.
With Agora you can enable the custom audio source using:
self.agoraKit.enableExternalAudioSource(withSampleRate: sampleRate, channelsPerFrame: channel)
When you join the channel you would initialize the mic yourself and maintain that buffer. Then pass the custom audio frames to the SDK:
self.agoraKit.pushExternalAudioFrame(buffer, timestamp) // timestamp in ms, e.g. Date().timeIntervalSince1970 * 1000
For more details I'd recommend taking a look at Agora's API Examples Project. You can use some of the Audio Controllers to see how the audio is handled.

How to Toggle video recording in Twilio

I am trying to add a Toggle recording feature into my web application,
Using Twilio Client version 2.0 and Generating token via Java Server side code.
Toggle(just to be clear)-> Being able to pause/start recording when a call is already underway.
Question : How to implement toggle recording feature in Twilio, If that's possible?
As of now, pausing/resuming a recording on a video call is not possible.
Response from Twilio support:
Thanks for reaching out to us! At this time, pausing and resuming a recording isn't possible. This is a feature we're looking to add in the future, but I don't have a specific date to share on when that will be available. I'll add your ticket to the feature request so that we can accurately set a priority for implementing it. Please let me know if you have any questions!

Is there a way to change VoiceOver output channel?

Is there a way (Swift 3/Obj-C) to redirect VoiceOver output between headphones, speaker, Bluetooth, etc.?
As far as I could see, the API documentation doesn't have any information on changing the output channel.
We would like the ability to record voice while the user is listening to VoiceOver feedback on headphones (or change that, if they like).
On iOS, all sound is initially routed to the default audio route, including VoiceOver. When a new audio route is connected (e.g. Bluetooth headphones), two extra VoiceOver options become available. First, a "Destination" option appears in the VoiceOver rotor that lets users choose the system's audio route. Second, the Sound section of VoiceOver settings in the Settings app includes options to select separate "Speech Channels" and "Sound Channels". You can even limit speech output to a single channel of the route, which lets you listen to music in one ear and VoiceOver output in the other.
On Mac, this is a user-configurable setting within VoiceOver Utility.
You cannot redirect VoiceOver output programmatically on either platform, probably because of the catastrophic user experience it could create. Imagine redirecting your visual output to an unplugged device in your closet: how would you recover control of your computer?
You can use overrideOutputAudioPort on AVAudioSession to redirect the audio output, assuming your session category is set to playAndRecord.
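A minimal sketch of that AVAudioSession route override (note this changes your app's own session routing; as the previous answer points out, VoiceOver's speech output itself remains under system control, so this may not redirect VoiceOver):

```swift
import AVFoundation

func routeToSpeaker() throws {
    let session = AVAudioSession.sharedInstance()
    // overrideOutputAudioPort(.speaker) only takes effect with the
    // playAndRecord category.
    try session.setCategory(.playAndRecord, options: [.allowBluetooth])
    try session.setActive(true)
    // Force output to the built-in speaker; pass .none to restore
    // the default route (e.g. headphones when plugged in).
    try session.overrideOutputAudioPort(.speaker)
}
```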

Voice Command without Pressing a Button on iOS

Currently, I am working on developing the iOS App that triggers an event upon voice command.
I saw a camera app where the user says "start recording" and the camera switches to recording mode.
This is an in-app voice control capability, so I am thinking it is different from SiriKit or SpeechRecognizer, both of which I have already implemented.
How would I achieve it?
My question is NOT about voice dictation, where the user has to press a button to start dictating.
The app needs to passively wait for a keyword, or intent, something like "myApp, start recording" or "myApp, stop recording", and then start/stop that function accordingly.
Thanks.
OpenEars: free speech recognition and speech synthesis for the iPhone.
OpenEars makes it simple to add offline speech recognition in many languages and synthesized speech/TTS to your iPhone app quickly and easily, and lets everyone get great results with advanced speech app interface concepts.
Check out this link.
http://www.politepix.com/openears/
or
Building an iOS App like Siri
https://www.raywenderlich.com/60870/building-ios-app-like-siri
Thank you.
How would I achieve it?
There's a new iOS 13 feature called Voice Control that will allow you to reach your goal.
You can find useful information in the Customize Commands section, where all the vocal commands are available (you can create custom ones as well):
For the camera example you mentioned, everything can be done vocally.
I showed the item names so the vocal commands I used are clear, but they can be hidden if you prefer ("hide names").
Voice Control is a built-in feature you can use inside your apps as well.
The only thing you may need to do as a developer is adapt the accessibilityUserInputLabels property if specific names should be displayed for some items in your app.
If you're looking for voice commands without pressing a button on iOS, Voice Control is THE perfect candidate.
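For instance (the button and its labels here are made up for illustration), a recording control could expose alternative Voice Control names like this:

```swift
import UIKit

let recordButton = UIButton(type: .system)
recordButton.setTitle("REC", for: .normal)
// Voice Control accepts any of these spoken names for the control,
// so users can say "Tap Record" or "Tap Start recording".
recordButton.accessibilityUserInputLabels = ["Record", "Start recording"]
```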

How does Audiobus for iOS work?

What SDKs does Audiobus use to provide inter-app audio routing? I am not aware of any Apple SDK that could facilitate inter-app communication on iOS, and I was under the impression that apps were sandboxed from each other, so I'm really intrigued to hear how they pulled this off.
iOS allows inter-app communication via MIDI SysEx messages. Audiobus works by sending audio as MIDI SysEx messages. You can read the details from the developer himself:
http://atastypixel.com/blog/thirteen-months-of-audiobus/
My guess is that they use some sort of audio over the network, because I've seen log statements when our app gets started, even on a different device.
I don't really know the details of the implementation, but this could be a way of staying within the "sandbox" constraints.
The Audiobus SDK probably uses the Audio Session rules to "organize" all the sound output from apps using their SDK; as you can see in their videos (at the bottom of the page), the apps have a side menu to switch back and forth between apps.
The Audio Session Category states:
Allows mixing: if yes, audio from other applications (such as the iPod) can continue playing when your application plays sound.
This way Audiobus can "control" the sound and keep the session persistent between apps.
