In one of my applications I am using the Speech framework to convert the user's voice into text.
Basically, I want my application to be hands-free, operated by a few voice commands.
Apple limits recognition to 1000 requests per hour, and an SFSpeechRecognitionTask only lasts about 1 minute.
I want the SFSpeechRecognitionTask to stay alive and keep recognising the voice.
What is the best way to do this in code? Would it drain too much battery if I restart the SFSpeechRecognitionTask every minute?
I have written the code below to start detecting voice, and it stops after 1 minute.
Please help me out if there is a way to achieve this.
func startRecording() {
if recognitionTask != nil {
recognitionTask?.cancel()
recognitionTask = nil
}
let audioSession = AVAudioSession.sharedInstance()
do {
try audioSession.setCategory(AVAudioSessionCategoryRecord)
try audioSession.setMode(AVAudioSessionModeMeasurement)
try audioSession.setActive(true, with: .notifyOthersOnDeactivation)
} catch {
print("audioSession properties weren't set because of an error.")
}
recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
guard let inputNode = audioEngine.inputNode else {
fatalError("Audio engine has no input node")
}
guard let recognitionRequest = recognitionRequest else {
fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object")
}
recognitionRequest.shouldReportPartialResults = true
recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in
var isFinal = false
if result != nil {
// Cancel any pending timer before scheduling a new one.
self.speechTimer?.invalidate()
self.speechTimer = nil
print(result?.bestTranscription.formattedString as Any)
self.speechTimer = Timer.scheduledTimer(withTimeInterval: 2.0, repeats: false, block: { (timer) in
print("Recognition task restart")
})
isFinal = (result?.isFinal)!
if isFinal {
print("Final String: \(result?.bestTranscription.formattedString ?? "No string")")
}
}
})
let recordingFormat = inputNode.outputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
self.recognitionRequest?.append(buffer)
}
audioEngine.prepare()
do {
try audioEngine.start()
} catch {
print("audioEngine couldn't start because of an error.")
}
}
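As a rough check on the request limit, the arithmetic can be sketched as below. The 1000-per-hour and 1-minute figures are the ones quoted in the question, not values I have verified, and `requestsPerHour` is a hypothetical helper written only for this illustration.

```swift
import Foundation

/// Number of recognition requests issued per hour when a new task is
/// started every `taskDuration` seconds (rounded up).
/// Hypothetical helper, not part of the Speech framework.
func requestsPerHour(taskDuration: TimeInterval) -> Int {
    precondition(taskDuration > 0, "taskDuration must be positive")
    return Int((3600.0 / taskDuration).rounded(.up))
}

// Figures taken from the question (assumptions, not documented values).
let hourlyLimit = 1000
let perHour = requestsPerHour(taskDuration: 60)
print("Restarting every 60 s uses \(perHour) of \(hourlyLimit) requests per hour")
```

Under those assumptions, one restart per minute stays far below the limit, so the limit alone should not rule out the restart approach; the battery cost is a separate question that only measurement on a device can answer.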
Related
I'm trying to write a speech recognizer that can clear its result after some duration (like two seconds) without being switched off.
private func startRecording() throws {
recognitionTask?.cancel()
self.recognitionTask = nil
let audioSession = AVAudioSession.sharedInstance()
try audioSession.setCategory(.record, mode: .measurement, options: .duckOthers)
try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
let inputNode = audioEngine.inputNode
recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
guard let recognitionRequest = recognitionRequest else { fatalError("Unable to create a SFSpeechAudioBufferRecognitionRequest object") }
recognitionRequest.shouldReportPartialResults = true
// false allows server-based recognition; set this to true to keep speech data on device
if #available(iOS 13, *) {
recognitionRequest.requiresOnDeviceRecognition = false
}
recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest) { result, error in
var isFinal = false
if let result = result {
self.textView.text = result.bestTranscription.formattedString
// MARK: after 2 seconds the recognizer should clear its own result string here
isFinal = result.isFinal
print("Text \(result.bestTranscription.formattedString)")
}
if error != nil || isFinal {
self.audioEngine.stop()
inputNode.removeTap(onBus: 0)
self.recognitionRequest = nil
self.recognitionTask = nil
self.recordButton.isEnabled = true
self.recordButton.setTitle("Start Recording", for: [])
}
}
let recordingFormat = inputNode.outputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
self.recognitionRequest?.append(buffer)
}
audioEngine.prepare()
try audioEngine.start()
textView.text = "(Go ahead, I'm listening)"
}
I cannot tear down and rebuild the speech recognition; it has to keep running constantly.
I tried the following, but afterwards my microphone turns off:
inputNode.removeTap(onBus: 0)
audioEngine.inputNode.removeTap(onBus: 0)
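One possible way to get the "clear" effect without touching the tap or the audio engine is to leave the task running and hide the text that was already shown. The sketch below is only an illustration of that idea: `TranscriptWindow` is a hypothetical type, and it assumes the recognizer keeps returning the full transcription so far in `formattedString`.

```swift
import Foundation

/// Tracks how much of the running transcription has already been cleared,
/// so the UI can show only the text spoken since the last pause.
/// Hypothetical helper, not part of the Speech framework.
struct TranscriptWindow {
    private(set) var clearedPrefix = ""
    private(set) var latest = ""

    /// Call with result.bestTranscription.formattedString on every result.
    mutating func update(with fullText: String) {
        latest = fullText
    }

    /// Call when the pause timer fires (for example after 2 seconds).
    mutating func clear() {
        clearedPrefix = latest
    }

    /// Text to display: whatever follows the cleared prefix.
    var visibleText: String {
        guard clearedPrefix.isEmpty || latest.hasPrefix(clearedPrefix) else {
            // The recognizer revised earlier words; fall back to the full text.
            return latest
        }
        return String(latest.dropFirst(clearedPrefix.count))
            .trimmingCharacters(in: .whitespaces)
    }
}
```

In the result handler you would call `update(with:)` and assign `visibleText` to the text view, then call `clear()` from a 2-second timer. Partial results can rewrite earlier words, which is why the sketch falls back to the full text when the prefix no longer matches.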
I'm trying to let the user speak and capture correctly what the user says. I have read about 20 different articles on speech recognition and almost all of them are the same: the recognizer keeps listening to the user for a minute or more. I want it to stop recognition when the user stops speaking. I want to catch the word or few words that the user says. Is there something that limits how long the user can speak?
My code block:
func recordAndRecognizeSpeech(){
if recognitionTask != nil {
recognitionTask?.cancel()
recognitionTask = nil
}
let audioSession = AVAudioSession.sharedInstance()
do {
try audioSession.setCategory(.record, mode: .measurement, options: .duckOthers)
try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
} catch {
print("audioSession properties weren't set because of an error.")
}
recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
let node = audioEngine.inputNode
guard let request = recognitionRequest else {
fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object")
}
//request.shouldReportPartialResults = true
// Setting requiresOnDeviceRecognition to false would use the Apple Cloud for speech recognition.
if speechRecognizer?.supportsOnDeviceRecognition ?? false{
request.requiresOnDeviceRecognition = true
}
guard let myRecognizer = SFSpeechRecognizer() else {
// A recognizer is not supported for the current locale
return
}
if !myRecognizer.isAvailable {
// A recognizer is not available now
return
}
recognitionTask = speechRecognizer?.recognitionTask(with: request, resultHandler: { result, error in
if let result = result {
DispatchQueue.main.async {
let bestString = result.bestTranscription.formattedString
print(bestString)
}
} else if let error = error {
print(error)
self.audioEngine.stop()
node.removeTap(onBus: 0)
self.recognitionRequest = nil
self.recognitionTask = nil
self.speakButton.isEnabled = true
}
})
let recordingFormat = node.outputFormat(forBus: 0)
node.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat){buffer,_ in
self.recognitionRequest?.append(buffer)
}
audioEngine.prepare()
do {
try audioEngine.start()
} catch {
return print(error)
}
}
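You can check the power of the sound input, and if it falls to a minimum value, start a timer (say 3 seconds) and stop recognition when the timer fires.
// Metering must be enabled and the recorder must be recording for readings to be valid.
var recorder: AVAudioRecorder?
recorder?.isMeteringEnabled = true
recorder?.updateMeters()
// averagePower(forChannel:) returns dBFS: 0 is full scale, -160 is near silence.
let dB = recorder?.averagePower(forChannel: 0) ?? -160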
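The threshold-plus-timer idea could look roughly like the sketch below. `SilenceDetector` is a hypothetical type, and the -40 dB threshold and 3-second window are assumed values that would need tuning for a real microphone and room. Timestamps are passed in explicitly so the logic does not depend on a running timer.

```swift
import Foundation

/// Decides when the input has been quiet for long enough to stop recognition.
/// Hypothetical helper; threshold and duration are assumed defaults.
struct SilenceDetector {
    let thresholdDB: Float
    let requiredSilence: TimeInterval
    private var silenceStart: TimeInterval?

    init(thresholdDB: Float = -40, requiredSilence: TimeInterval = 3) {
        self.thresholdDB = thresholdDB
        self.requiredSilence = requiredSilence
        self.silenceStart = nil
    }

    /// Feed one meter reading (in dB) with its timestamp in seconds.
    /// Returns true once the level has stayed below the threshold
    /// for at least `requiredSilence` seconds.
    mutating func process(levelDB: Float, at time: TimeInterval) -> Bool {
        if levelDB >= thresholdDB {
            // Sound detected: reset the silence window.
            silenceStart = nil
            return false
        }
        let start = silenceStart ?? time
        silenceStart = start
        return time - start >= requiredSilence
    }
}
```

You would feed it the value from `averagePower(forChannel: 0)` on a repeating timer and stop the audio engine and end the request once `process` returns true.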
A video does not play after using the speech recognizer. I get no error; it just gets stuck on AVPlayerViewController. I stop the speech recognizer first and then try to play the video. The video plays perfectly before the speech recognizer is used.
Possibly the speech recognizer is not actually being stopped by this code, so the problem may be in stopRecording().
@IBAction func btnRecord(_ sender: Any) {
player.pause()
player.seek(to: CMTime.init(value: 0, timescale: player.currentTime().timescale))
if self.audioEngine.isRunning {
self.audioEngine.stop()
self.recognitionRequest?.endAudio()
}
else {
try! self.startRecording()
}
}
private func startRecording() throws {
// Cancel the previous task if it's running.
if let recognitionTask = recognitionTask {
recognitionTask.cancel()
self.recognitionTask = nil
}
let audioSession = AVAudioSession.sharedInstance()
try audioSession.setCategory(AVAudioSession.Category.record, mode: .default, options: [])
try audioSession.setMode(AVAudioSession.Mode.measurement)
try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
let inputNode = audioEngine.inputNode
//else { fatalError("Audio engine has no input node") }
guard let recognitionRequest = recognitionRequest else { fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object") }
// Configure request so that results are returned before audio recording is finished
recognitionRequest.shouldReportPartialResults = true
// A recognition task represents a speech recognition session.
// We keep a reference to the task so that it can be cancelled.
recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest) { result, error in
var isFinal = false
if let result = result {
self.text = result.bestTranscription.formattedString
self.lblText.text = self.text
isFinal = result.isFinal
}
if error != nil || isFinal {
self.audioEngine.stop()
inputNode.removeTap(onBus: 0)
self.recognitionRequest = nil
self.recognitionTask = nil
}
}
let recordingFormat = inputNode.outputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
self.recognitionRequest?.append(buffer)
}
audioEngine.prepare()
try audioEngine.start()
}
private func stopRecording() {
audioEngine.stop()
recognitionRequest?.endAudio()
if let recognitionTask = recognitionTask {
recognitionTask.cancel()
self.recognitionTask = nil
}
}
@IBAction func btnDonePopup(_ sender: Any) {
self.stopRecording()
self.playVideo()
}
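Please change the audioSession.setCategory value back to the default (.soloAmbient) once recognition finishes. The .record category silences playback audio, which is likely why the video does not play: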
if error != nil || isFinal {
self.audioEngine.stop()
inputNode.removeTap(onBus: 0)
self.recognitionRequest = nil
self.recognitionTask = nil
do {
try audioSession.setCategory(.soloAmbient, mode: .measurement, options: [])
} catch { }
}
I am working on an app that uses the new Speech framework in iOS 10 to do speech-to-text. What is the best way to stop the recognition when the user stops talking?
private func startRecording() {
isRecording = true
if let recognitionTask = recognitionTask {
recognitionTask.cancel()
self.recognitionTask = nil
}
let audioSession = AVAudioSession.sharedInstance()
do {
try audioSession.setCategory(AVAudioSessionCategoryRecord, mode: AVAudioSessionModeMeasurement)
try audioSession.setActive(true, with: .notifyOthersOnDeactivation)
} catch {
print("audioSession properties weren't set because of an error.")
return
}
recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
guard let inputNode = audioEngine.inputNode else {
fatalError("Audio engine has no input node")
}
guard let recognitionRequest = recognitionRequest else {
fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object")
}
recognitionRequest.shouldReportPartialResults = true
recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in
if let result = result {
if error != nil || result.isFinal {
self.audioEngine.stop()
inputNode.removeTap(onBus: 0)
self.recognitionRequest = nil
self.recognitionTask = nil
let questionText = result.bestTranscription.formattedString
self.isRecording = false
self.audioEngine.stop()
recognitionRequest.endAudio()
self.audioEngine.inputNode?.removeTap(onBus: 0)
}
}
})
let recordingFormat = inputNode.outputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
self.recognitionRequest?.append(buffer)
}
audioEngine.prepare()
try! audioEngine.start()
}
I want this code to be called once the user stops talking:
private func stopRecording() {
isRecording = false
audioEngine.stop()
recognitionRequest?.endAudio()
audioEngine.inputNode?.removeTap(onBus: 0)
}
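One common way to detect "the user stopped talking" is to treat a gap in partial results as the end of speech: every result postpones a pending call, and the call only runs once results stop arriving. The sketch below shows that debounce pattern on its own. `Debouncer` is a hypothetical class, the delay is an assumed value, and it should be signalled from a single thread (for example the main thread).

```swift
import Foundation

/// Runs `action` once no call to `signal()` has arrived for `delay` seconds.
/// Hypothetical helper, not part of the Speech framework.
final class Debouncer {
    private let delay: TimeInterval
    private let queue: DispatchQueue
    private let action: () -> Void
    private var pending: DispatchWorkItem?

    init(delay: TimeInterval, queue: DispatchQueue = .main, action: @escaping () -> Void) {
        self.delay = delay
        self.queue = queue
        self.action = action
    }

    /// Call on every recognition result; each call postpones the action.
    func signal() {
        pending?.cancel()
        let item = DispatchWorkItem { [weak self] in self?.action() }
        pending = item
        queue.asyncAfter(deadline: .now() + delay, execute: item)
    }

    /// Drops any pending action, for example when recording is stopped manually.
    func cancel() {
        pending?.cancel()
        pending = nil
    }
}

// Possible wiring inside the view controller (names from the question):
//   lazy var silenceDebouncer = Debouncer(delay: 1.5) { [weak self] in self?.stopRecording() }
//   ...and in the result handler: self.silenceDebouncer.signal()
```

With shouldReportPartialResults set to true, results arrive while the user is speaking, so the action fires roughly `delay` seconds after the last recognized word. It will not fire if the user never says anything, so a separate overall timeout may still be needed.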
This is my first time using SFSpeechRecognizer in Swift and one piece of functionality isn't working. When I press the button (audioButtonPressed), recognition seems to start fine, and pressing it again stops it. When I press it a third time to start recognition again, the recognition doesn't work and leaves me with a blank text view. How should I do this?
Here's my code:
@IBAction func audioButtonPressed(_ sender: Any) {
if isRecording {
stopRecording()
delegate?.speechRecognitionComplete(query: query)
audioButton.backgroundColor = UIColor.red
isRecording = false
} else {
startRecording()
audioButton.backgroundColor = UIColor.green
isRecording = true
}
}
func stopRecording() {
audioEngine.stop()
audioEngine.inputNode?.removeTap(onBus: 0)
recognitionRequest = nil
recognitionTask = nil
}
func startRecording() {
recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
guard let recognitionRequest = recognitionRequest else {
return
}
recognitionRequest.shouldReportPartialResults = true
recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in
var isFinal = false
if result != nil {
self.query = result?.bestTranscription.formattedString
self.audioTextField.text = self.query
isFinal = (result?.isFinal)!
}
if error != nil || isFinal {
self.stopRecording()
}
})
let audioSession = AVAudioSession.sharedInstance()
do {
try audioSession.setCategory(AVAudioSessionCategoryRecord)
try audioSession.setMode(AVAudioSessionModeMeasurement)
try audioSession.setActive(true, with: .notifyOthersOnDeactivation)
} catch {
print("the audio session isn't configured correctly")
}
let recordingFormat = audioEngine.inputNode?.outputFormat(forBus: 0)
audioEngine.inputNode?.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, time) in
self.recognitionRequest?.append(buffer)
}
audioEngine.prepare()
do {
try audioEngine.start()
audioTextField.text = "How may I help you"
} catch {
print("audio engine failed to start")
}
}
When I first press the audio button, startRecording is called and it works perfectly; pressing it again calls stopRecording, which also works fine; but pressing it a third time does not start the recognition again. Any ideas?
I think you are missing a call to recognitionTask?.cancel() before you set the task to nil in the stopRecording function.