Get frequency of a sound recorded with the microphone - iOS

I need to calculate the frequency, in Hertz, of a sound recorded with the microphone.
What I'm doing now is using AVAudioRecorder to listen to the mic, with a timer that calls a specific function every 0.5 seconds. Here is some code:
import UIKit
import AVFoundation

class ViewController: UIViewController {

    var audioRecorder: AVAudioRecorder?
    var timer: Timer?

    override func viewDidLoad() {
        super.viewDidLoad()
        // Do any additional setup after loading the view.
        let permission = AVAudioSession.sharedInstance().recordPermission
        if permission == AVAudioSession.RecordPermission.undetermined {
            AVAudioSession.sharedInstance().requestRecordPermission { (granted) in
                if granted {
                    print("Permission granted!")
                } else {
                    print("Permission not granted!")
                }
            }
        } else if permission == AVAudioSession.RecordPermission.granted {
            do {
                try AVAudioSession.sharedInstance().setCategory(AVAudioSession.Category.record)
                let settings = [
                    AVSampleRateKey: 44100.0,
                    AVFormatIDKey: kAudioFormatAppleLossless,
                    AVNumberOfChannelsKey: 1,
                    AVEncoderAudioQualityKey: AVAudioQuality.max.rawValue
                ] as [String: Any]
                audioRecorder = try AVAudioRecorder.init(url: NSURL.fileURL(withPath: "/dev/null"), settings: settings)
                audioRecorder?.prepareToRecord()
                audioRecorder?.isMeteringEnabled = true
                audioRecorder?.record()
                timer = Timer.scheduledTimer(
                    timeInterval: 0.5,
                    target: self,
                    selector: #selector(analyze),
                    userInfo: nil,
                    repeats: true
                )
            } catch (let error) {
                print("Error! \(error.localizedDescription)")
            }
        }
    }

    @objc func analyze() {
        audioRecorder?.updateMeters()
        let peak = audioRecorder?.peakPower(forChannel: 0)
        print("Peak : \(peak)")
        audioRecorder?.updateMeters()
    }
}
I do not know how to get the frequency of the sound in hertz. I'm also fine with using a third-party framework.
Thanks.

Any given recorded sound will not have a single frequency. It will have a mix of frequencies at different amplitudes.
You need to do frequency analysis on the input audio, usually by running an FFT (Fast Fourier Transform) on the audio data.
A Google search turned up this article on doing frequency analysis using the Accelerate framework:
http://www.myuiviews.com/2016/03/04/visualizing-audio-frequency-spectrum-on-ios-via-accelerate-vdsp-fast-fourier-transform.html
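For reference, here is a minimal sketch (not the linked article's code) of that approach using vDSP from the Accelerate framework. It assumes you already have a buffer of mono Float samples; AVAudioRecorder's metering only exposes power levels, so in practice you would capture raw samples with something like an AVAudioEngine input tap. The function simply reports the frequency of the strongest FFT bin:
import Accelerate

func dominantFrequency(of samples: [Float], sampleRate: Float) -> Float? {
    guard samples.count >= 2 else { return nil }
    // Use the largest power-of-two prefix of the buffer for the radix-2 FFT.
    let log2n = vDSP_Length(floor(log2(Float(samples.count))))
    let n = 1 << Int(log2n)
    guard let setup = vDSP_create_fftsetup(log2n, FFTRadix(kFFTRadix2)) else { return nil }
    defer { vDSP_destroy_fftsetup(setup) }

    var realp = [Float](repeating: 0, count: n / 2)
    var imagp = [Float](repeating: 0, count: n / 2)
    var magnitudes = [Float](repeating: 0, count: n / 2)

    realp.withUnsafeMutableBufferPointer { realPtr in
        imagp.withUnsafeMutableBufferPointer { imagPtr in
            var split = DSPSplitComplex(realp: realPtr.baseAddress!, imagp: imagPtr.baseAddress!)
            // Pack the real samples into split-complex form, FFT in place, then take magnitudes.
            samples.withUnsafeBufferPointer { buffer in
                buffer.baseAddress!.withMemoryRebound(to: DSPComplex.self, capacity: n / 2) {
                    vDSP_ctoz($0, 2, &split, 1, vDSP_Length(n / 2))
                }
            }
            vDSP_fft_zrip(setup, &split, 1, log2n, FFTDirection(FFT_FORWARD))
            vDSP_zvmags(&split, 1, &magnitudes, 1, vDSP_Length(n / 2))
        }
    }

    // The bin with the largest magnitude is the strongest frequency component.
    var maxMagnitude: Float = 0
    var maxIndex: vDSP_Length = 0
    vDSP_maxvi(magnitudes, 1, &maxMagnitude, &maxIndex, vDSP_Length(n / 2))
    return Float(maxIndex) * sampleRate / Float(n)
}
Keep in mind that the strongest bin is only a rough pitch estimate; proper pitch detection usually adds windowing, interpolation between bins, or an autocorrelation-based method.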

Related

Cannot listen to iOS app on Bluetooth if user has granted microphone access

I am developing an iOS 14 app that plays fragments of an audio file for the user to imitate. If the user wants to, the app can record the user's responses and play these back immediately. The user can also export an audio recording that comprises the original fragments, plus the user's responses.
I am using AudioKit 4.11
Because it is possible the user may never wish to take advantage of the app's recording abilities, the app initially adopts the audio session category of .playback. If the user wants to use the recording feature, the app triggers the standard Apple dialog for requesting microphone access, and if this is granted, switches the session category to .playAndRecord.
I have found that when the session category is .playback and the user has not yet granted microphone permission, I am able to listen to the app's output on a Bluetooth speaker, or on my Jabra Elite 65t Bluetooth earbuds when the app is running on a real iPhone. In the example below, this is the case when the app first runs and the user has only ever tapped "Play sound" or "Stop".
However, as soon as I tap "Play sound and record response" and grant microphone access, I am unable to listen to the app's output on a Bluetooth device, regardless of whether the session category applicable at the time is .playback (after tapping "Play sound and record response") or .playAndRecord (after tapping "Play sound") - unless I subsequently go to my phone's Privacy settings and toggle microphone access to off. Playback is available only through the phone's speaker, or through plugged in headphones.
When setting the session category of .playAndRecord I have tried invoking the .allowBluetoothA2DP option.
Apple's advice implies this should allow me to listen to my app's sound over Bluetooth in the circumstances I have described above (see https://developer.apple.com/documentation/avfoundation/avaudiosession/categoryoptions/1771735-allowbluetootha2dp). However I've not found this to be the case.
The code below represents a runnable app (albeit one requiring the presence of AudioKit 4.11) that illustrates the problem in a simplified form. The only elements not shown here are an NSMicrophoneUsageDescription that I added to info.plist, and the file "blues.mp3" which I imported into the project.
ContentView:
import SwiftUI
import AudioKit
import AVFoundation
struct ContentView: View {
private var pr = PlayerRecorder()
var body: some View {
VStack{
Text("Play sound").onTapGesture{
pr.setupforPlay()
pr.playSound()
}
.padding()
Text("Play sound and record response").onTapGesture{
if recordingIsAllowed() {
pr.activatePlayAndRecord()
pr.startSoundAndResponseRecording()
}
}
.padding()
Text("Stop").onTapGesture{
pr.stop()
}
.padding()
}
}
func recordingIsAllowed() -> Bool {
var retval = false
AVAudioSession.sharedInstance().requestRecordPermission { granted in
retval = granted
}
return retval
}
}
PlayerRecorder:
import Foundation
import AudioKit
class PlayerRecorder {
private var mic: AKMicrophone!
private var micBooster: AKBooster!
private var mixer: AKMixer!
private var outputBooster: AKBooster!
private var player: AKPlayer!
private var playerBooster: AKBooster!
private var recorder: AKNodeRecorder!
private var soundFile: AKAudioFile!
private var twentySecondTimer = Timer()
init() {
AKSettings.defaultToSpeaker = true
AKSettings.disableAudioSessionDeactivationOnStop = true
AKSettings.notificationsEnabled = true
}
func activatePlayAndRecord() {
do {
try AKManager.shutdown()
} catch {
print("Shutdown failed")
}
setupForPlayAndRecord()
}
func playSound() {
do {
soundFile = try AKAudioFile(readFileName: "blues.mp3")
} catch {
print("Failed to open sound file")
}
do {
try player.load(audioFile: soundFile!)
} catch {
print("Player failed to load sound file")
}
if micBooster != nil{
micBooster.gain = 0.0
}
player.play()
}
func setupforPlay() {
do {
try AKSettings.setSession(category: .playback)
} catch {
print("Failed to set session category to .playback")
}
mixer = AKMixer()
outputBooster = AKBooster(mixer)
player = AKPlayer()
playerBooster = AKBooster(player)
playerBooster >>> mixer
AKManager.output = outputBooster
if !AKManager.engine.isRunning {
try? AKManager.start()
}
}
func setupForPlayAndRecord() {
AKSettings.audioInputEnabled = true
do {
try AKSettings.setSession(category: .playAndRecord)
/* Have tried the following instead of the line above, but without success
let options: AVAudioSession.CategoryOptions = [.allowBluetoothA2DP]
try AKSettings.setSession(category: .playAndRecord, options: options.rawValue)
Have also tried:
try AKSettings.setSession(category: .multiRoute)
*/
} catch {
print("Failed to set session category to .playAndRecord")
}
mic = AKMicrophone()
micBooster = AKBooster(mic)
mixer = AKMixer()
outputBooster = AKBooster(mixer)
player = AKPlayer()
playerBooster = AKBooster(player)
mic >>> micBooster
micBooster >>> mixer
playerBooster >>> mixer
AKManager.output = outputBooster
micBooster.gain = 0.0
outputBooster.gain = 1.0
if !AKManager.engine.isRunning {
try? AKManager.start()
}
}
func startSoundAndResponseRecording() {
// Start player and recorder. After 20 seconds, call a function that stops the player
// (while allowing recording to continue until user taps Stop button).
activatePlayAndRecord()
playSound()
// Force removal of any tap not previously removed with stop() call for recorder
var mixerNode: AKNode?
mixerNode = mixer
for i in 0..<8 {
mixerNode?.avAudioUnitOrNode.removeTap(onBus: i)
}
do {
recorder = try? AKNodeRecorder(node: mixer)
try recorder.record()
} catch {
print("Failed to start recorder")
}
twentySecondTimer = Timer.scheduledTimer(timeInterval: 20.0, target: self, selector: #selector(stopPlayerOnly), userInfo: nil, repeats: false)
}
func stop(){
twentySecondTimer.invalidate()
if player.isPlaying {
player.stop()
}
if recorder != nil {
if recorder.isRecording {
recorder.stop()
}
}
if AKManager.engine.isRunning {
do {
try AKManager.stop()
} catch {
print("Error occurred while stopping engine.")
}
}
print("Stopped")
}
@objc func stopPlayerOnly() {
player.stop()
if !mic.isStarted {
mic.start()
}
if !micBooster.isStarted {
micBooster.start()
}
mic.volume = 1.0
micBooster.gain = 1.0
outputBooster.gain = 0.0
}
}
Three additional lines of code near the beginning of setupForPlayAndRecord() solve the problem:
func setupForPlayAndRecord() {
AKSettings.audioInputEnabled = true
// Adding the following three lines solves the problem
AKSettings.useBluetooth = true
let categoryOptions: AVAudioSession.CategoryOptions = [.allowBluetoothA2DP]
AKSettings.bluetoothOptions = categoryOptions
do {
try AKSettings.setSession(category: .playAndRecord)
} catch {
print("Failed to set session category to .playAndRecord")
}
mic = AKMicrophone()
micBooster = AKBooster(mic)
mixer = AKMixer()
outputBooster = AKBooster(mixer)
player = AKPlayer()
playerBooster = AKBooster(player)
mic >>> micBooster
micBooster >>> mixer
playerBooster >>> mixer
AKManager.output = outputBooster
micBooster.gain = 0.0
outputBooster.gain = 1.0
if !AKManager.engine.isRunning {
try? AKManager.start()
}
}

Why does Swift AVAudioPlayer not change state after MPRemoteCommandCenter play command?

The environment for this is iOS 13.6 and Swift 5. I have a very simple app that successfully plays an MP3 file in the foreground or background. I added MPRemoteCommandCenter play and pause command handlers to it. I play the sound file in the foreground and then pause it.
When I tap the play button from the lock screen, my code calls audioPlayer.play(), which returns true. I hear the sound start playing again, but the currentTime of the player does not advance. After that, the play and pause buttons on the lock screen do nothing. When I foreground the app again, the play button plays from where it was before I went to the lock screen.
Here is my AudioPlayer class:
import AVFoundation
import MediaPlayer
class AudioPlayer: RemoteAudioCommandDelegate {
var audioPlayer = AVAudioPlayer()
let remoteCommandHandler = RemoteAudioCommandHandler()
var timer:Timer!
func play(title: String) {
let path = Bundle.main.path(forResource: title, ofType: "mp3")!
let url = URL(fileURLWithPath: path)
do {
try AVAudioSession.sharedInstance().setCategory(AVAudioSession.Category.playback)
try AVAudioSession.sharedInstance().setActive(true)
audioPlayer = try AVAudioPlayer(contentsOf: url)
remoteCommandHandler.delegate = self
remoteCommandHandler.enableDisableRemoteCommands(true)
timer = Timer.scheduledTimer(timeInterval: 1.0, target: self, selector: #selector(updateNowPlayingInfo), userInfo: nil, repeats: true)
} catch let error as NSError {
print("error = \(error)")
}
}
func play() {
print ("audioPlayer.play() returned \(audioPlayer.play())")
}
func pause() {
audioPlayer.pause()
}
func stop() {
audioPlayer.stop()
}
func currentTime() -> TimeInterval {
return audioPlayer.currentTime
}
func setCurrentTime(_ time:TimeInterval) {
audioPlayer.currentTime = time
}
@objc func updateNowPlayingInfo() {
// Hard-code the nowPlayingInfo since this is a simple test app
var nowPlayingDict =
[MPMediaItemPropertyTitle: "Tin Man",
MPMediaItemPropertyAlbumTitle: "The Complete Greatest Hits",
MPMediaItemPropertyAlbumTrackNumber: NSNumber(value: UInt(10) as UInt),
MPMediaItemPropertyArtist: "America",
MPMediaItemPropertyPlaybackDuration: 208,
MPNowPlayingInfoPropertyPlaybackRate: NSNumber(value: 1.0 as Float)] as [String : Any]
nowPlayingDict[MPNowPlayingInfoPropertyElapsedPlaybackTime] = NSNumber(value: audioPlayer.currentTime as Double)
MPNowPlayingInfoCenter.default().nowPlayingInfo = nowPlayingDict
}
}
Here is my RemoteCommandHandler class:
import Foundation
import MediaPlayer
protocol RemoteAudioCommandDelegate: class {
func play()
func pause()
}
class RemoteAudioCommandHandler: NSObject {
weak var delegate: RemoteAudioCommandDelegate?
var remoteCommandCenter = MPRemoteCommandCenter.shared()
var playTarget: Any? = nil
var pauseTarget: Any? = nil
func enableDisableRemoteCommands(_ enabled: Bool) {
print("Called with enabled = \(enabled)")
remoteCommandCenter.playCommand.isEnabled = enabled
remoteCommandCenter.pauseCommand.isEnabled = enabled
if enabled {
addRemoteCommandHandlers()
} else {
removeRemoteCommandHandlers()
}
}
fileprivate func addRemoteCommandHandlers() {
print( "Entered")
if playTarget == nil {
print( "Adding playTarget")
playTarget = remoteCommandCenter.playCommand.addTarget { (event) -> MPRemoteCommandHandlerStatus in
print("addRemoteCommandHandlers calling delegate play")
self.delegate?.play()
return .success
}
}
if pauseTarget == nil {
print( "Adding pauseTarget")
pauseTarget = remoteCommandCenter.pauseCommand.addTarget { (event) -> MPRemoteCommandHandlerStatus in
print("addRemoteCommandHandlers calling delegate pause")
self.delegate?.pause()
return .success
}
}
}
fileprivate func removeRemoteCommandHandlers() {
print( "Entered")
if playTarget != nil {
print( "Removing playTarget")
remoteCommandCenter.playCommand.removeTarget(playTarget)
playTarget = nil
}
if pauseTarget != nil {
print( "Removing pauseTarget")
remoteCommandCenter.pauseCommand.removeTarget(pauseTarget)
pauseTarget = nil
}
}
}
I will gladly supply further info if required, because I'm baffled as to why this (in my mind) relatively straightforward code doesn't work.
Assistance is much appreciated!
After some more debugging, I found that the AVAudioPlayer started to play the sound from the lock screen, but stopped again shortly after.
I mitigated the problem by adding a Timer. The timer checks whether the last command from the user was play but the sound is not actually playing, and if so it restarts playback. I also update that state when the user selects pause or when the song stops playing at its natural end.
I am still at a loss for an actual fix for this problem.
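For illustration only, a watchdog along those lines might look like the sketch below. The class and the lastCommandWasPlay flag are placeholder names of my own; the answer above does not show its actual implementation:
import AVFoundation

// Hypothetical sketch of the mitigation described above, not the answer's actual code.
class PlaybackWatchdog {
    private var timer: Timer?
    var lastCommandWasPlay = false   // set true in the play handler, false in pause/stop

    func start(watching player: AVAudioPlayer) {
        timer = Timer.scheduledTimer(withTimeInterval: 1.0, repeats: true) { [weak self, weak player] _ in
            guard let self = self, let player = player else { return }
            // If the user's last command was play but playback has silently stopped,
            // kick the player again from wherever it left off.
            if self.lastCommandWasPlay && !player.isPlaying {
                player.play()
            }
        }
    }

    func stop() {
        timer?.invalidate()
        timer = nil
    }
}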

How to use a bluetooth microphone and a built-in speaker at the same time?

I'm trying to achieve a very simple task in iOS. I would like to get audio from a Bluetooth microphone and play the audio simultaneously using the built-in speaker.
Here is what I have right now. The app allows the user to choose a speaker and a microphone source. However, whenever I pick the built-in speaker, the app switches to the built-in microphone, and whenever I pick a Bluetooth microphone, the app uses the Bluetooth speaker. There is no way I can select the Bluetooth microphone and the built-in speaker independently.
import UIKit
import AVFoundation
import AVKit
class ViewController: UIViewController {
var engine = AVAudioEngine()
let player = AVAudioPlayerNode()
let audioSession = AVAudioSession()
var routePickerView = AVRoutePickerView(frame: CGRect(x: 0, y: 0, width: 0, height: 50))
let bus = 0
var isRunning = false
@IBOutlet weak var viewHolder: UIStackView!
override func viewDidLoad() {
super.viewDidLoad()
setUpAVRoutePicker()
setUpAVSession()
}
func setUpAVRoutePicker() {
viewHolder.addArrangedSubview(routePickerView)
}
func setUpAVSession() {
do {
try audioSession.setCategory(AVAudioSession.Category.playAndRecord, options: [.defaultToSpeaker, .allowBluetooth, .allowBluetoothA2DP, .allowAirPlay])
try audioSession.setMode(AVAudioSession.Mode.default)
try audioSession.overrideOutputAudioPort(AVAudioSession.PortOverride.speaker)
try audioSession.setActive(true)
} catch {
print("Error setting up AV session!")
print(error)
}
}
@IBAction func start(_ sender: AnyObject) {
isRunning = !isRunning
sender.setTitle(isRunning ? "Stop" : "Start", for: .normal)
if isRunning {
engine.attach(player)
let inputFormat = engine.inputNode.inputFormat(forBus: bus)
engine.connect(player, to: engine.mainMixerNode, format: inputFormat)
engine.inputNode.installTap(onBus: bus, bufferSize: 512, format: inputFormat) { (buffer, time) -> Void in
self.player.scheduleBuffer(buffer)
}
do {
try engine.start()
} catch {
print("Engine start error")
print(error)
return
}
player.play()
} else {
engine.inputNode.removeTap(onBus: bus)
engine.stop()
player.stop()
}
}
@IBAction func onInputBtnClicked(_ sender: AnyObject) {
let controller = UIAlertController(title: "Select Input", message: "", preferredStyle: UIAlertController.Style.actionSheet)
for input in audioSession.availableInputs ?? [] {
controller.addAction(UIAlertAction(title: input.portName, style: UIAlertAction.Style.default, handler: { action in
do {
try self.audioSession.setPreferredInput(input)
} catch {
print("Setting preferred input error")
print(error)
}
}))
}
present(controller, animated: true, completion: nil)
}
}
It seems like it's impossible to achieve that (https://stackoverflow.com/a/24519938/6308776), which is kinda crazy. I know that it's possible to achieve such a simple task on Android, but it looks like it's not possible for iOS. Does anyone have any ideas?
P.S.: I would like to use a Bluetooth headset, and I don't want the audio to be transmitted over Wi-Fi.
From Apple:
https://forums.developer.apple.com/thread/62954
... if an application is using setPreferredInput to select a Bluetooth
HFP input, the output should automatically be changed to the Bluetooth
HFP output corresponding with that input. Moreover, selecting a
Bluetooth HFP output using the MPVolumeView's route picker should
automatically change the input to the Bluetooth HFP input
corresponding with that output. In other words, both the input and
output should always end up on the same Bluetooth HFP device chosen
for either input/output even though only the input or output was set
individually. This is the intended behavior, but if it's not happening
we definitely want to know about it.

Custom Audio Recorder Framework Swift 3.0

I am creating an audio recording framework, and it compiles correctly. When I use this framework in a project, the recording file gets created in the Documents folder as specified in the framework, but its size stays at 4 KB, never increases, and the file contains no audio. I have given a 30-second duration for the recording. I have used AVFoundation for audio recording, and the same code works if I use it directly in my project, but invoking it through the custom framework doesn't work.
public func startRecording() {
do {
if (recordingSession != nil) {
return
}
recordingSession = AVAudioSession.sharedInstance()
try recordingSession.setCategory(AVAudioSessionCategoryPlayAndRecord)
try recordingSession.setActive(true)
recordingSession.requestRecordPermission() { allowed in
DispatchQueue.main.async {
if allowed {
print("allowed")
let audioFilename = self.getDocumentsDirectory().appendingPathComponent("recording.caf")
print(audioFilename)
let settings = [
AVFormatIDKey: Int(kAudioFormatAppleIMA4),
AVSampleRateKey: 16000,
AVNumberOfChannelsKey: 1,
AVLinearPCMBitDepthKey: 16,
AVLinearPCMIsBigEndianKey: 0,
AVLinearPCMIsFloatKey: 0
]
do {
self.audioRecorder = try AVAudioRecorder(url: audioFilename, settings: settings)
self.audioRecorder.delegate = self
self.audioRecorder.isMeteringEnabled = true
let audioHWAvailable = self.recordingSession.isInputAvailable
if !audioHWAvailable
{
print("no audio input available")
return
}
UIApplication.shared.beginReceivingRemoteControlEvents()
self.audioRecorder.record(forDuration: 30)
if self.audioRecorder.prepareToRecord()
{
print("Recording started")
self.audioRecorder.record()
}
} catch {
self.finishRecording()
}
} else {
print("failed to record!")
}
}
}
} catch {
print("failed to record!")
}
}
I have this startRecording method in the framework class, which I call from my project.
EDIT: When I add a timer after the self.audioRecorder.record() line, the recording works, but I don't understand the reason.
Finally, I found the solution with help from the Apple support team.
By making the call to startRecording() through a global variable holding an instance of the AudioRecording class, it worked without adding a timer.
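Presumably the instance was being deallocated before the asynchronous permission callback and the recording could finish. A minimal usage sketch of the fix from the host app, where MyRecordingFramework is a placeholder for the actual framework module and AudioRecording is the framework class named above:
import MyRecordingFramework   // placeholder for the actual framework module name

// A global (or otherwise long-lived) reference keeps the recorder instance alive
// for the duration of the permission callback and the 30-second recording.
let sharedRecorder = AudioRecording()

func beginRecording() {
    sharedRecorder.startRecording()
}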

How To Make iOS Speech-To-Text Persistent

I am conducting initial research on a new potential product. Part of this product requires Speech-To-Text on both iPhones and iPads to remain on until the user turns it off. Upon using it myself, I noticed that it either automatically shuts off after 30 or so seconds, regardless of whether the user has stopped speaking, or it shuts off after a certain number of questionable words from the speaker. In any case, this product requires it to remain on all of the time until explicitly told to stop. Has anybody worked with this before? And yes, I have searched for this already, but I couldn't seem to find anything of substance, especially anything written in the right language. Thanks, friends!
import Speech
let recognizer = SFSpeechRecognizer()
let request = SFSpeechURLRecognitionRequest(url: audioFileURL)
#if targetEnvironment(simulator)
request.requiresOnDeviceRecognition = /* only appears to work on device; not simulator */ false
#else
request.requiresOnDeviceRecognition = /* only appears to work on device; not simulator */ true
#endif
recognizer?.recognitionTask(with: request, resultHandler: { (result, error) in
print (result?.bestTranscription.formattedString)
})
The above code snippet, when run on a physical device will continuously ("persistently") transcribe audio using Apple's Speech Framework.
The magic line here is request.requiresOnDeviceRecognition = ...
If request.requiresOnDeviceRecognition is true and SFSpeechRecognizer#supportsOnDeviceRecognition is true, then the audio will continuously be transcribed until battery dies, user cancels transcription, or some other error/terminating condition occurs. This is at least true in my trials.
Docs:
https://developer.apple.com/documentation/speech/recognizing_speech_in_live_audio
I found a tutorial here that shows how to transcribe your speech. But see the notes:
Apple limits recognition per device. The limit is not known, but you can contact Apple for more information.
Apple limits recognition per app.
If you routinely hit limits, make sure to contact Apple, they can probably resolve it.
Speech recognition uses a lot of power and data.
Speech recognition only lasts about a minute at a time.
EDIT
This answer was for iOS 10. I expect the release of iOS 12 in October 2018, but Apple still says:
Plan for a one-minute limit on audio duration. Speech recognition can
place a relatively high burden on battery life and network usage. In
iOS 10, utterance audio duration is limited to about one minute, which
is similar to the limit for keyboard-related dictation.
See: https://developer.apple.com/documentation/speech
There are no API changes in the Speech framework for iOS 11 and 12. See all API changes, especially for iOS 12, detailed by Paul Hudson: iOS 12 APIs Diffs
So my answer should still be valid.
///
/// Code lightly adopted by 🫐🥢 from https://developer.apple.com/documentation/speech/recognizing_speech_in_live_audio?language=swift
///
/// Modifications from original:
/// - Color of text changes every time a new "chunk" of text is transcribed
/// -- This was a feature I added while playing with my nephews. They loved it (2 and 6) (we kept saying rainbow)
/// - I added a bit of logic to scroll to the end of the text once new chunks were added
/// - I formatted the code using swiftformat
///
import Speech
import UIKit
public class ViewController: UIViewController, SFSpeechRecognizerDelegate {
private let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!
private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
private var recognitionTask: SFSpeechRecognitionTask?
private let audioEngine = AVAudioEngine()
@IBOutlet var textView: UITextView!
@IBOutlet var recordButton: UIButton!
let colors: [UIColor] = [.red, .orange, .yellow, .green, .blue, .purple]
var colorIndex = 0
override public func viewDidLoad() {
super.viewDidLoad()
textView.textColor = colors[colorIndex]
// Disable the record buttons until authorization has been granted.
recordButton.isEnabled = false
}
override public func viewDidAppear(_ animated: Bool) {
super.viewDidAppear(animated)
// Configure the SFSpeechRecognizer object already
// stored in a local member variable.
speechRecognizer.delegate = self
// Asynchronously make the authorization request.
SFSpeechRecognizer.requestAuthorization { authStatus in
// Divert to the app's main thread so that the UI
// can be updated.
OperationQueue.main.addOperation {
switch authStatus {
case .authorized:
self.recordButton.isEnabled = true
case .denied:
self.recordButton.isEnabled = false
self.recordButton.setTitle("User denied access to speech recognition", for: .disabled)
case .restricted:
self.recordButton.isEnabled = false
self.recordButton.setTitle("Speech recognition restricted on this device", for: .disabled)
case .notDetermined:
self.recordButton.isEnabled = false
self.recordButton.setTitle("Speech recognition not yet authorized", for: .disabled)
default:
self.recordButton.isEnabled = false
}
}
}
}
private func startRecording() throws {
// Cancel the previous task if it's running.
recognitionTask?.cancel()
recognitionTask = nil
// Configure the audio session for the app.
let audioSession = AVAudioSession.sharedInstance()
try audioSession.setCategory(.record, mode: .measurement, options: .duckOthers)
try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
let inputNode = audioEngine.inputNode
// Create and configure the speech recognition request.
recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
////////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////
/// The below lines are responsible for keeping the recording active longer
/// than just short bursts. I've had the recording going all day in somewhat
/// rudimentary attempts.
////////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////
if #available(iOS 13, *) {
let supportsOnDeviceRecognition = speechRecognizer.supportsOnDeviceRecognition
if !supportsOnDeviceRecognition {
fatalError("On device transcription not supported on this device. It is safe to remove this error but I wanted to add it as a warning that you'd actually see.")
}
recognitionRequest!.requiresOnDeviceRecognition = /* only appears to work on device; not simulator */ supportsOnDeviceRecognition
}
guard let recognitionRequest = recognitionRequest else { fatalError("Unable to create a SFSpeechAudioBufferRecognitionRequest object") }
recognitionRequest.shouldReportPartialResults = true
// Create a recognition task for the speech recognition session.
// Keep a reference to the task so that it can be canceled.
recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest) { result, error in
var isFinal = false
if let result = result {
// Update the text view with the results.
self.colorIndex = (self.colorIndex + 1) % self.colors.count
self.textView.text = result.bestTranscription.formattedString
self.textView.textColor = self.colors[self.colorIndex]
self.textView.scrollRangeToVisible(NSMakeRange(result.bestTranscription.formattedString.count - 1, 0))
isFinal = result.isFinal
print("Text \(result.bestTranscription.formattedString)")
}
if error != nil || isFinal {
// Stop recognizing speech if there is a problem.
self.audioEngine.stop()
inputNode.removeTap(onBus: 0)
self.recognitionRequest = nil
self.recognitionTask = nil
self.recordButton.isEnabled = true
self.recordButton.setTitle("Start Recording", for: [])
}
}
// Configure the microphone input.
let recordingFormat = inputNode.outputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, _: AVAudioTime) in
self.recognitionRequest?.append(buffer)
}
audioEngine.prepare()
try audioEngine.start()
// Let the user know to start talking.
textView.text = "(Go ahead, I'm listening)"
}
// MARK: SFSpeechRecognizerDelegate
public func speechRecognizer(_: SFSpeechRecognizer, availabilityDidChange available: Bool) {
if available {
recordButton.isEnabled = true
recordButton.setTitle("Start Recording", for: [])
} else {
recordButton.isEnabled = false
recordButton.setTitle("Recognition Not Available", for: .disabled)
}
}
// MARK: Interface Builder actions
@IBAction func recordButtonTapped() {
if audioEngine.isRunning {
audioEngine.stop()
recognitionRequest?.endAudio()
recordButton.isEnabled = false
recordButton.setTitle("Stopping", for: .disabled)
} else {
do {
try startRecording()
recordButton.setTitle("Stop Recording", for: [])
} catch {
recordButton.setTitle("Recording Not Available", for: [])
}
}
}
}
The above code snippet, when run on a physical device will continuously ("persistently") transcribe audio using Apple's Speech Framework.
The magic line here is request.requiresOnDeviceRecognition = ...
If request.requiresOnDeviceRecognition is true and SFSpeechRecognizer#supportsOnDeviceRecognition is true, then the audio will continuously be transcribed until battery dies, user cancels transcription, or some other error/terminating condition occurs. This is at least true in my trials.
Docs:
https://developer.apple.com/documentation/speech/recognizing_speech_in_live_audio
Notes:
I had originally attempted editing this answer [0] but wanted to add so much detail that I felt it completely hijacked the original answerer. I will be maintaining my own answer with ideally: An approach that translates this one into SwiftUI and also into the Composable Architecture (adopting their example [1]) as a canonical source of quick start for voice transcription on Apple Platforms.
0: https://stackoverflow.com/a/38729106/2441420
1: https://github.com/pointfreeco/swift-composable-architecture/tree/main/Examples/SpeechRecognition/SpeechRecognition
This will help you automatically restart recording every 40 seconds, even if you don't speak anything. If you speak something and then there is silence for 2 seconds, it will stop and the didFinishTalk function is called.
@objc func startRecording() {
self.fullsTring = ""
audioEngine.reset()
if recognitionTask != nil {
recognitionTask?.cancel()
recognitionTask = nil
}
let audioSession = AVAudioSession.sharedInstance()
do {
try audioSession.setCategory(.record)
try audioSession.setMode(.measurement)
try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
try audioSession.setPreferredSampleRate(44100.0)
if audioSession.isInputGainSettable {
let error : NSErrorPointer = nil
let success = try? audioSession.setInputGain(1.0)
guard success != nil else {
print ("audio error")
return
}
if (success != nil) {
print("\(String(describing: error))")
}
}
else {
print("Cannot set input gain")
}
} catch {
print("audioSession properties weren't set because of an error.")
}
recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
let inputNode = audioEngine.inputNode
guard let recognitionRequest = recognitionRequest else {
fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object")
}
recognitionRequest.shouldReportPartialResults = true
self.timer4 = Timer.scheduledTimer(timeInterval: TimeInterval(40), target: self, selector: #selector(againStartRec), userInfo: nil, repeats: false)
recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest, resultHandler: { (result, error ) in
var isFinal = false //8
if result != nil {
self.timer.invalidate()
self.timer = Timer.scheduledTimer(timeInterval: TimeInterval(2.0), target: self, selector: #selector(self.didFinishTalk), userInfo: nil, repeats: false)
let bestString = result?.bestTranscription.formattedString
self.fullsTring = bestString!
self.inputContainerView.inputTextField.text = result?.bestTranscription.formattedString
isFinal = result!.isFinal
}
if error == nil{
}
if isFinal {
self.audioEngine.stop()
inputNode.removeTap(onBus: 0)
self.recognitionRequest = nil
self.recognitionTask = nil
isFinal = false
}
if error != nil{
URLCache.shared.removeAllCachedResponses()
self.audioEngine.stop()
inputNode.removeTap(onBus: 0)
guard let task = self.recognitionTask else {
return
}
task.cancel()
task.finish()
}
})
audioEngine.reset()
inputNode.removeTap(onBus: 0)
let recordingFormat = AVAudioFormat(standardFormatWithSampleRate: 44100, channels: 1)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
self.recognitionRequest?.append(buffer)
}
audioEngine.prepare()
do {
try audioEngine.start()
} catch {
print("audioEngine couldn't start because of an error.")
}
self.hasrecorded = true
}
@objc func againStartRec() {
self.inputContainerView.uploadImageView.setBackgroundImage( #imageLiteral(resourceName: "microphone") , for: .normal)
self.inputContainerView.uploadImageView.alpha = 1.0
self.timer4.invalidate()
timer.invalidate()
self.timer.invalidate()
if ((self.audioEngine.isRunning)){
self.audioEngine.stop()
self.recognitionRequest?.endAudio()
self.recognitionTask?.finish()
}
self.timer2 = Timer.scheduledTimer(timeInterval: 2, target: self, selector: #selector(startRecording), userInfo: nil, repeats: false)
}
@objc func didFinishTalk() {
if self.fullsTring != ""{
self.timer4.invalidate()
self.timer.invalidate()
self.timer2.invalidate()
if ((self.audioEngine.isRunning)){
self.audioEngine.stop()
guard let task = self.recognitionTask else {
return
}
task.cancel()
task.finish()
}
}
}
