iOS10 Speech Recognition "Listening" sound effect - ios

I am doing live speech recognition with the new iOS 10 Speech framework, using AVCaptureSession to capture the audio.
I have a "listening" beep that tells the user when to begin talking. The best place to play it is the first call to captureOutput(:didOutputSampleBuffer..), but any sound I try to play after starting the session simply doesn't play. No error is thrown; it just silently fails.
What I tried:
Playing a system sound (AudioServicesPlaySystemSound...())
Playing an asset with AVPlayer
Running both of the above sync and async on the main queue
Regardless of what I do, it seems impossible to play any kind of audio after triggering the recognition (I'm not sure whether the cause is the AVCaptureSession or the SFSpeechAudioBufferRecognitionRequest / SFSpeechRecognitionTask).
Any ideas? Apple even recommends playing a "listening" sound effect (and does so itself with Siri), but I couldn't find any reference or example showing how to actually do it (their "SpeakToMe" example doesn't play a sound).
I can play the sound before triggering the session, and that works (I start the session when the sound finishes playing). But sometimes there is a lag before recognition actually starts, mostly when using BT headphones and switching from a different AudioSession category, for which I have no completion event. Because of that I need a way to play the sound when recording actually starts, rather than playing it beforehand and hoping the start won't lag.

Well, apparently there are a bunch of "rules" one must follow in order to successfully begin a speech recognition session and play a "listening" effect only when (after) the recognition really began.
The session setup & triggering must be called on main queue. So:
DispatchQueue.main.async {
    speechRequest = SFSpeechAudioBufferRecognitionRequest()
    task = recognizer.recognitionTask(with: speechRequest, delegate: self)
    capture = AVCaptureSession()
    //.....
    shouldHandleRecordingBegan = true
    capture?.startRunning()
}
The "listening" effect should be played via AVPlayer, not as a system sound.
The safest place to know we are definitely recording is the AVCaptureAudioDataOutputSampleBufferDelegate callback, when we receive our first sample buffer:
func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) {
    //only once per recognition session
    if shouldHandleRecordingBegan {
        shouldHandleRecordingBegan = false
        player = AVPlayer(url: Bundle.main.url(forResource: "listening", withExtension: "aiff")!)
        player.play()
        DispatchQueue.main.async {
            //call delegate/handler closure/post notification etc...
        }
    }
    // append buffer to speech recognition
    speechRequest?.appendAudioSampleBuffer(sampleBuffer)
}
Playing the end-of-recognition effect is a lot easier:
var ended = false
if task?.state == .running || task?.state == .starting {
    task?.finish() // or task?.cancel() to cancel and not get results.
    ended = true
}
if true == capture?.isRunning {
    capture?.stopRunning()
}
if ended {
    player = AVPlayer(url: Bundle.main.url(forResource: "done", withExtension: "aiff")!)
    player.play()
}
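One detail worth guarding: the flag is set on the main queue, while captureOutput is delivered on whatever queue was passed to setSampleBufferDelegate(_:queue:). If that queue is not the main queue, the read and the write can race. A minimal sketch of a lock-protected one-shot flag follows; OneShotFlag and its method names are hypothetical and not part of the answer above.

```swift
import Foundation

/// Hypothetical helper: a flag armed on one thread and consumed exactly once on another.
final class OneShotFlag {
    private let lock = NSLock()
    private var armed = false

    /// Arm the flag, e.g. on the main queue just before startRunning().
    func arm() {
        lock.lock()
        armed = true
        lock.unlock()
    }

    /// Returns true only for the first caller after arm(); later callers get false.
    func consume() -> Bool {
        lock.lock()
        defer { lock.unlock() }
        let wasArmed = armed
        armed = false
        return wasArmed
    }
}
```

With this in place, the setup block would call arm() where it sets shouldHandleRecordingBegan = true, and the sample buffer callback would wrap the "listening" playback in if flag.consume() { ... }.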

Related

AVAudioEngine recording microphone input seems to stop upon playing music

I’m recording microphone input to match it to a song in the Shazam catalog. This works, but if I start() the AVAudioEngine and then something else happens, such as music starting to play via MPMusicPlayerController.applicationMusicPlayer.play(), the audio engine seems to stop or get interrupted. The microphone recording shuts off, so the SHSessionDelegate never finds a match and never fails with an error, and my UI is stuck showing that it's listening when it no longer is. Is there a way to be informed when this happens so that I can update the UI to handle cancelation?
private lazy var shazamAudioEngine = AVAudioEngine()
private lazy var shazamSession: SHSession = {
    let session = SHSession()
    session.delegate = self
    return session
}()
...
try? AVAudioSession.sharedInstance().setCategory(.record)
//Create an audio format for our buffers based on the format of the input, with a single channel (mono)
let audioFormat = AVAudioFormat(standardFormatWithSampleRate: shazamAudioEngine.inputNode.outputFormat(forBus: 0).sampleRate, channels: 1)
//Install a "tap" in the audio engine's input so that we can send buffers from the microphone to the session
shazamAudioEngine.inputNode.installTap(onBus: 0, bufferSize: 2048, format: audioFormat) { [weak self] buffer, when in
    //Whenever a new buffer comes in, we send it over to the session for recognition
    self?.shazamSession.matchStreamingBuffer(buffer, at: when)
}
do {
    try shazamAudioEngine.start()
} catch {
    ...
}
In my testing isRunning tracks this state, so it changes from true to false when you start playing music and the microphone stops being recorded. Unfortunately that property can't be observed with KVO so what I did was set up a repeating Timer to detect if it changes to handle cancelation, making sure to invalidate() the timer when other state changes occur.
audioEngineRunningTimer = Timer.scheduledTimer(withTimeInterval: 0.5, repeats: true) { [weak self] timer in
    //the audio engine stops running when you start playing music for example, so handle cancelation here
    if self?.shazamAudioEngine.isRunning == false {
        //...
    }
}
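A possible alternative to polling is to listen for the notifications AVFoundation posts around such stops. The sketch below is untested against this exact scenario: it assumes the stop surfaces either as an audio session interruption or as an engine configuration change, which the answer above does not confirm. EngineStopObserver and onStop are hypothetical names.

```swift
import AVFoundation

/// Hypothetical helper: calls onStop when the session is interrupted or the
/// engine reports a configuration change while no longer running.
final class EngineStopObserver {
    private var tokens: [NSObjectProtocol] = []

    init(engine: AVAudioEngine, onStop: @escaping () -> Void) {
        let center = NotificationCenter.default

        // Posted when another audio source interrupts this app's session.
        tokens.append(center.addObserver(
            forName: AVAudioSession.interruptionNotification,
            object: AVAudioSession.sharedInstance(),
            queue: .main
        ) { note in
            guard let raw = note.userInfo?[AVAudioSessionInterruptionTypeKey] as? UInt,
                  AVAudioSession.InterruptionType(rawValue: raw) == .began else { return }
            onStop()
        })

        // Posted when the engine's I/O configuration changes; the engine is stopped at that point.
        tokens.append(center.addObserver(
            forName: .AVAudioEngineConfigurationChange,
            object: engine,
            queue: .main
        ) { [weak engine] _ in
            if engine?.isRunning == false { onStop() }
        })
    }

    deinit {
        tokens.forEach { NotificationCenter.default.removeObserver($0) }
    }
}
```

If neither notification fires when the music player starts, the isRunning timer remains the fallback; the two approaches can also be combined.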

How can I control my music player from a headset?

I am creating a music player iOS app that gets its data from Firebase. I am able to control play, pause, next and previous in the simulator and on an iPhone. When a headset is connected to the device, however, play, next and previous do not work properly.
Here is the code I've used:
func setupRemoteCommandCenter() {
    // Get the shared MPRemoteCommandCenter
    let commandCenter = MPRemoteCommandCenter.shared()
    // Add handler for Play Command
    commandCenter.playCommand.addTarget { event in
        player?.play()
        print("headset play")
        return .success
    }
    // Add handler for Pause Command
    commandCenter.pauseCommand.addTarget { event in
        player?.pause()
        print("headset pause")
        return .success
    }
    // Add handler for Next Command
    commandCenter.nextTrackCommand.addTarget { event in
        return .success
    }
    // Add handler for Previous Command
    commandCenter.previousTrackCommand.addTarget { event in
        return .success
    }
}
I call the setupRemoteCommandCenter function in viewDidLoad.
This document says that to receive player event notifications you need to
begin playing audio
be the "Now playing app"
The definition of "Now playing app" is hard to pin down, but it seems to be any app that has an active, non-mixable audio session and is playing audio (or has very recently played audio; there seems to be a brief grace period). One possible non-mixable audio session is:
let session = AVAudioSession.sharedInstance()
do {
    try session.setCategory(AVAudioSessionCategoryPlayback)
    try session.setActive(true)
} catch let err as NSError {
    print("Error setting up non mixable audio session \(err)")
}
P.S. If you want the headset controls to work while the screen is locked (or if you want the lock screen controls to work, for that matter), you will need to add audio to Background Modes, because technically your app has been backgrounded once the screen is locked.
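Note also that the next and previous handlers in the question return .success without doing anything, so those buttons would have no effect even once events arrive. A minimal sketch of the track bookkeeping such handlers could call is shown below; PlaylistCursor is a hypothetical type, and wrapping around at the ends of the list is an assumed design choice, not something the question specifies.

```swift
/// Hypothetical helper: tracks the current position in a playlist of known length.
struct PlaylistCursor {
    let trackCount: Int
    private(set) var index = 0

    init(trackCount: Int) {
        precondition(trackCount > 0, "playlist must not be empty")
        self.trackCount = trackCount
    }

    /// Advance to the next track, wrapping to the first after the last.
    mutating func next() -> Int {
        index = (index + 1) % trackCount
        return index
    }

    /// Step back to the previous track, wrapping to the last before the first.
    mutating func previous() -> Int {
        index = (index - 1 + trackCount) % trackCount
        return index
    }
}
```

Inside nextTrackCommand's handler, the app would call next(), load the track at the returned index into its player, and only then return .success; previousTrackCommand would do the same with previous().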

Keep AVAudioPlayer sound in the memory

I use AVAudioPlayer to play a click sound if the user taps on a button.
Because there is a delay between the tap and the sound, I play the sound once in viewDidAppear with volume = 0.
I found that if the user taps the button within a certain time period, the sound plays immediately, but after that period there is again a delay between the tap and the sound.
It seems that in the first case the sound comes from a cache filled by the initial play, and in the second case the app has to load the sound again.
So now I play the sound every 2 seconds with volume = 0, and when the user actually taps the button the sound plays right away.
My question: is there a better approach for this?
My goal is to keep the sound cached for the whole lifetime of the app.
Thank you,
To avoid audio lag, use the .prepareToPlay() method of AVAudioPlayer.
Apple's documentation on prepareToPlay():
"Calling this method preloads buffers and acquires the audio hardware needed for playback, which minimizes the lag between calling the play() method and the start of sound output."
If player is declared as an AVAudioPlayer then player.prepareToPlay() can be called to avoid the audio lag. Example code:
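import AVFoundation

// SoundType is not shown in this answer; it is assumed to be a String-backed
// enum of file extensions (e.g. "wav", "aiff").
struct AudioPlayerManager {
    // Start with no player; an AVAudioPlayer created without a file cannot play anything.
    var player: AVAudioPlayer?

    mutating func setupPlayer(soundName: String, soundType: SoundType) {
        if let soundURL = Bundle.main.url(forResource: soundName, withExtension: soundType.rawValue) {
            do {
                player = try AVAudioPlayer(contentsOf: soundURL)
                // Preload buffers so the first play() starts with minimal lag.
                player?.prepareToPlay()
            } catch {
                print(error.localizedDescription)
            }
        } else {
            print("Sound file was missing, name is misspelled or wrong case.")
        }
    }
}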
Then play() can be called with minimal lag:
player?.play()
If you keep a strong reference to the AVAudioPlayer, the sound stays in memory and no further lag will occur.
The first delay is caused by loading the sound, so your first playback in viewDidAppear is the right idea.

Using Spotify/background music with camera open

I have an app that needs to have:
Background music playing while using the app (eg. spotify)
Background music playing while watching movie from AVPlayer
Stop the music when recording a video
Like Snapchat, the camera-viewcontroller is part of a "swipeview" and therefore always on.
However, when opening and closing the app, the music makes a short "crack" noise/sound that ruins the music.
I recorded it here:
https://soundcloud.com/morten-stulen/hacky-sound-ios
(3 occurrences)
I use these settings to configure the AVAudioSession in the app delegate's didFinishLaunchingWithOptions:
do {
    try AVAudioSession.sharedInstance().setCategory(AVAudioSessionCategoryPlayAndRecord, withOptions:
        [AVAudioSessionCategoryOptions.MixWithOthers,
         AVAudioSessionCategoryOptions.DefaultToSpeaker])
    try AVAudioSession.sharedInstance().setActive(true)
} catch {
    print("error")
}
I use the LLSimpleCamera control for video recording and I've set the session there to:
_session.automaticallyConfiguresApplicationAudioSession = NO;
It seems others have the same problem with other camera libraries as well:
https://github.com/rFlex/SCRecorder/issues/127
https://github.com/rFlex/SCRecorder/issues/224
This guy removed the audioDeviceInput, but I kinda need that for recording video.
https://github.com/omergul123/LLSimpleCamera/issues/48
I also tried Apple's "AVCam" sample code, and I still have the same issue. How does Snapchat do this?!
Any help would be greatly appreciated, and I'll gladly provide more info or code!
I do something similar, though without the camera, and I think it will do what you want. My app allows background audio that mixes with non-fullscreen video/audio. When the user plays an audio file or a full-screen video file, I stop the background audio completely.
The reason I set SoloAmbient and then Playback is that I allow my audio to keep playing in the background when the device is locked. Switching to SoloAmbient stops all background music, and then switching to Playback lets my audio play in the app as well as in the background.
This is why the unload method calls a method that sets the lock screen information. In that case it clears it out so that no lock screen info is shown.
In AppDelegate.swift
//MARK: Audio Session Mixing
func allowBackgroundAudio()
{
    do {
        try AVAudioSession.sharedInstance().setCategory(AVAudioSessionCategoryPlayback, withOptions: .MixWithOthers)
    } catch {
        NSLog("AVAudioSession SetCategory - Playback:MixWithOthers failed")
    }
}
func preventBackgroundAudio()
{
    do {
        //Ask for Solo Ambient to prevent any background audio playing, then change to normal Playback so we can play while locked
        try AVAudioSession.sharedInstance().setCategory(AVAudioSessionCategorySoloAmbient)
        try AVAudioSession.sharedInstance().setCategory(AVAudioSessionCategoryPlayback)
    } catch {
        NSLog("AVAudioSession SetCategory - SoloAmbient failed")
    }
}
When I want to stop background audio, for example when playing an audio track that should be alone, I do the following:
In MyAudioPlayer.swift
func playUrl(url: NSURL?, backgroundImageUrl: NSURL?, title: String, subtitle: String)
{
    ForgeHelper.appDelegate().preventBackgroundAudio()
    if _mediaPlayer == nil {
        self._mediaPlayer = MediaPlayer()
        _mediaPlayer!.delegate = self
    }
    //... Code removed for brevity
}
And when I'm done with my media playing, I do this:
private func unloadMediaPlayer()
{
    if _mediaPlayer != nil {
        _mediaPlayer!.unload()
        self._mediaPlayer = nil
    }
    _controlView.updateForProgress(0, duration: 0, animate: false)
    ForgeHelper.appDelegate().allowBackgroundAudio()
    setLockScreenInfo()
}
Hope this helps you out!
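The snippets above use the older string-constant spelling of the AVAudioSession API, which does not compile with current Swift. A sketch of the same two helpers with the enum-based spelling (AVAudioSession.Category, available in recent SDKs) might look like this. It keeps the answer's SoloAmbient-then-Playback sequence as is and is not a claim that this sequence removes the crackle described in the question.

```swift
import AVFoundation

/// Let other apps' audio (e.g. a music app) keep playing alongside ours.
func allowBackgroundAudio() {
    do {
        try AVAudioSession.sharedInstance()
            .setCategory(.playback, mode: .default, options: [.mixWithOthers])
    } catch {
        print("Setting playback with mixWithOthers failed: \(error)")
    }
}

/// Silence other apps' audio, then switch to plain playback so our own audio
/// can continue while the device is locked.
func preventBackgroundAudio() {
    do {
        let session = AVAudioSession.sharedInstance()
        try session.setCategory(.soloAmbient, mode: .default, options: [])
        try session.setCategory(.playback, mode: .default, options: [])
    } catch {
        print("Switching to non-mixing playback failed: \(error)")
    }
}
```

For the camera case in the question, the category would presumably be .playAndRecord with .mixWithOthers and .defaultToSpeaker instead, matching the options the question already sets.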

How to scrub audio with AVPlayer?

I'm using seekToTime for an AVPlayer. It works fine, however, I'd like to be able to hear audio as I scrub through the video, much like how Final Cut or other video editors work. Just looking for ideas or if I've missed something obvious.
One way to do this is to scrub a second, simultaneous AVPlayer asynchronously alongside the video. I did it this way (in Swift 4):
// create the simultaneous player and feed it the same URL:
let videoPlayer2 = AVPlayer(url: sameUrlAsVideoPlayer!)
// AVPlayer.volume ranges from 0.0 (silent) to 1.0 (full volume, the default):
videoPlayer2.volume = 1.0
// set the simultaneous player to exactly the same point as the video player:
videoPlayer2.seek(to: sameSeekTimeAsVideoPlayer)
// create a variable (letsScrub) that lets you activate the audio scrubber while the video is being scrubbed:
var letsScrub: Bool?
// while you are scrubbing the video, set letsScrub to true:
if letsScrub == true { audioScrub() }
//create this function to scrub audio asynchronously:
func audioScrub() {
    DispatchQueue.main.async {
        //set the variable to false so the scrubbing does not get interrupted:
        self.letsScrub = false
        //play the simultaneous player
        self.videoPlayer2.play()
        //make sure that it plays for at least 0.25 of a second before it stops to get that scrubbing effect:
        DispatchQueue.main.asyncAfter(deadline: .now() + 0.25) {
            //now that 1/4 of a second has passed (you can make it longer or shorter as you please) - pause the simultaneous player
            self.videoPlayer2.pause()
            //now move the simultaneous player to the same point as the original videoplayer:
            self.videoPlayer2.seek(to: self.sameSeekTimeAsVideoPlayer)
            //setting this variable back to true allows the process to be repeated as long as video is being scrubbed:
            self.letsScrub = true
        }
    }
}
