The app I'm currently making in Swift is meant to help blind people navigate the world with one comprehensive solution. I want to write a generic function for the app that, when called, immediately starts recording, listens for the user to say something, automatically stops recording once the user stops speaking, converts the recording to a string, and returns it. This function should be usable more than once in a single view controller.
I have tried using the technique from this article and it didn't work: https://medium.com/ios-os-x-development/speech-recognition-with-swift-in-ios-10-50d5f4e59c48
The recorder will be collecting the name of a building or a room in a building, so it doesn't need to be recording for terribly long - even a set length of time of 5 seconds would work. I am hoping to use a framework like Speech or something with Siri, but I am not opposed to using an external framework like Watson if it works better. Please help!
There's a beautiful appcoda tutorial here, which fits this perfectly.
This is the code they used to update a text field with the speech results. It can't be too difficult to channel the text going in their text field into whatever variable/function you use to process the result.
//
// ViewController.swift
// Siri
//
// Created by Sahand Edrisian on 7/14/16.
// Copyright © 2016 Sahand Edrisian. All rights reserved.
//
import UIKit
import Speech
class ViewController: UIViewController, SFSpeechRecognizerDelegate {
@IBOutlet weak var textView: UITextView!
@IBOutlet weak var microphoneButton: UIButton!
private let speechRecognizer = SFSpeechRecognizer(locale: Locale.init(identifier: "en-US"))!
private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
private var recognitionTask: SFSpeechRecognitionTask?
private let audioEngine = AVAudioEngine()
override func viewDidLoad() {
super.viewDidLoad()
microphoneButton.isEnabled = false
speechRecognizer.delegate = self
SFSpeechRecognizer.requestAuthorization { (authStatus) in
var isButtonEnabled = false
switch authStatus {
case .authorized:
isButtonEnabled = true
case .denied:
isButtonEnabled = false
print("User denied access to speech recognition")
case .restricted:
isButtonEnabled = false
print("Speech recognition restricted on this device")
case .notDetermined:
isButtonEnabled = false
print("Speech recognition not yet authorized")
}
OperationQueue.main.addOperation() {
self.microphoneButton.isEnabled = isButtonEnabled
}
}
}
@IBAction func microphoneTapped(_ sender: AnyObject) {
if audioEngine.isRunning {
audioEngine.stop()
recognitionRequest?.endAudio()
microphoneButton.isEnabled = false
microphoneButton.setTitle("Start Recording", for: .normal)
} else {
startRecording()
microphoneButton.setTitle("Stop Recording", for: .normal)
}
}
func startRecording() {
if recognitionTask != nil { //1
recognitionTask?.cancel()
recognitionTask = nil
}
let audioSession = AVAudioSession.sharedInstance() //2
do {
try audioSession.setCategory(AVAudioSessionCategoryRecord)
try audioSession.setMode(AVAudioSessionModeMeasurement)
try audioSession.setActive(true, with: .notifyOthersOnDeactivation)
} catch {
print("audioSession properties weren't set because of an error.")
}
recognitionRequest = SFSpeechAudioBufferRecognitionRequest() //3
guard let inputNode = audioEngine.inputNode else {
fatalError("Audio engine has no input node")
} //4
guard let recognitionRequest = recognitionRequest else {
fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object")
} //5
recognitionRequest.shouldReportPartialResults = true //6
recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in //7
var isFinal = false //8
if result != nil {
self.textView.text = result?.bestTranscription.formattedString //9
isFinal = (result?.isFinal)!
}
if error != nil || isFinal { //10
self.audioEngine.stop()
inputNode.removeTap(onBus: 0)
self.recognitionRequest = nil
self.recognitionTask = nil
self.microphoneButton.isEnabled = true
}
})
let recordingFormat = inputNode.outputFormat(forBus: 0) //11
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
self.recognitionRequest?.append(buffer)
}
audioEngine.prepare() //12
do {
try audioEngine.start()
} catch {
print("audioEngine couldn't start because of an error.")
}
textView.text = "Say something, I'm listening!"
}
func speechRecognizer(_ speechRecognizer: SFSpeechRecognizer, availabilityDidChange available: Bool) {
if available {
microphoneButton.isEnabled = true
} else {
microphoneButton.isEnabled = false
}
}
}
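To get closer to the "call it, speak, get a string back" behaviour asked for in the question, the tutorial code can be wrapped in a small class that ends the session by itself. The following is only a sketch under assumptions, not the tutorial's code: the class name OneShotListener, the two timing values, and the idea of treating "no new partial result for a while" as the end of speech are all my own choices for illustration. It also assumes Swift 4.2 or later, that speech and microphone permission were already granted, and that listen is called on the main thread.

```swift
import AVFoundation
import Speech

/// Hypothetical helper, not part of the tutorial: listens once and
/// hands the transcript to a completion handler.
final class OneShotListener {
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
    private let audioEngine = AVAudioEngine()
    private var request: SFSpeechAudioBufferRecognitionRequest?
    private var task: SFSpeechRecognitionTask?
    private var silenceTimer: Timer?
    private var latestText = ""
    private var completion: ((String) -> Void)?

    /// Assumed values; tune them for your users.
    var initialWait: TimeInterval = 5.0      // give up if nothing is heard
    var silenceInterval: TimeInterval = 1.5  // pause that ends the utterance

    /// Call on the main thread, after authorization has been granted.
    func listen(completion: @escaping (String) -> Void) throws {
        finish(deliver: false)               // tear down any earlier session
        self.completion = completion
        latestText = ""

        let session = AVAudioSession.sharedInstance()
        try session.setCategory(.record, mode: .measurement, options: [])
        try session.setActive(true, options: .notifyOthersOnDeactivation)

        let request = SFSpeechAudioBufferRecognitionRequest()
        request.shouldReportPartialResults = true
        self.request = request

        task = recognizer?.recognitionTask(with: request) { [weak self] result, error in
            DispatchQueue.main.async {
                // Ignore callbacks that belong to a session already torn down.
                guard let self = self, self.request === request else { return }
                if let result = result {
                    self.latestText = result.bestTranscription.formattedString
                    // A new partial result means the user is still talking.
                    self.restartTimer(after: self.silenceInterval)
                }
                if error != nil || result?.isFinal == true {
                    self.finish(deliver: true)
                }
            }
        }

        let inputNode = audioEngine.inputNode
        inputNode.removeTap(onBus: 0)        // never stack two taps on one bus
        let format = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
            request.append(buffer)
        }
        audioEngine.prepare()
        try audioEngine.start()
        restartTimer(after: initialWait)
    }

    private func restartTimer(after interval: TimeInterval) {
        silenceTimer?.invalidate()
        silenceTimer = Timer.scheduledTimer(withTimeInterval: interval, repeats: false) { [weak self] _ in
            self?.finish(deliver: true)
        }
    }

    private func finish(deliver: Bool) {
        silenceTimer?.invalidate()
        silenceTimer = nil
        audioEngine.stop()
        audioEngine.inputNode.removeTap(onBus: 0)
        request?.endAudio()
        task?.finish()                       // finish() rather than cancel()
        request = nil
        task = nil
        let callback = completion
        completion = nil
        if deliver { callback?(latestText) }
    }
}
```

Because recognition is asynchronous, the text comes back through the completion handler rather than as a return value. A view controller could keep one OneShotListener as a property and call try listener.listen { text in ... } as often as it needs; each call resets the previous session first.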
We are creating an online book-reading app in which we start a group video call (we use the Agora SDK for the video call). When a member joins the call, we start the book reading and highlight words on the other members' end as well. For recording and recognizing the text we use SFSpeechRecognizer, but whenever CallKit and the video call start, SFSpeechRecognizer always fails when it starts recording audio on the other members' end. Can you please provide a solution for recording audio during the video call?
//
// Speech.swift
// Edsoma
//
// Created by Kapil on 16/02/22.
//
import Foundation
import AVFoundation
import Speech
protocol SpeechRecognizerDelegate {
func didSpoke(speechRecognizer : SpeechRecognizer , word : String?)
}
class SpeechRecognizer: NSObject {
private let speechRecognizer = SFSpeechRecognizer(locale: Locale.init(identifier: "en-US")) //1
private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
private var recognitionTask: SFSpeechRecognitionTask?
private let audioEngine = AVAudioEngine()
var delegate : SpeechRecognizerDelegate?
static let shared = SpeechRecognizer()
var isOn = false
func setup(){
speechRecognizer?.delegate = self //3
SFSpeechRecognizer.requestAuthorization { (authStatus) in //4
var isButtonEnabled = false
switch authStatus { //5
case .authorized:
isButtonEnabled = true
case .denied:
isButtonEnabled = false
print("User denied access to speech recognition")
case .restricted:
isButtonEnabled = false
print("Speech recognition restricted on this device" )
case .notDetermined:
isButtonEnabled = false
print("Speech recognition not yet authorized")
@unknown default:
break;
}
OperationQueue.main.addOperation() {
// self.microphoneButton.isEnabled = isButtonEnabled
}
}
}
func transcribeAudio(url: URL) {
// create a new recognizer and point it at our audio
let recognizer = SFSpeechRecognizer()
let request = SFSpeechURLRecognitionRequest(url: url)
// start recognition!
recognizer?.recognitionTask(with: request) { [unowned self] (result, error) in
// abort if we didn't get any transcription back
guard let result = result else {
print("There was an error: \(error!)")
return
}
// if we got the final transcription back, print it
if result.isFinal {
// pull out the best transcription...
print(result.bestTranscription.formattedString)
}
}
}
func startRecording() {
isOn = true
let inputNode = audioEngine.inputNode
if recognitionTask != nil {
inputNode.removeTap(onBus: 0)
self.audioEngine.stop()
self.recognitionRequest = nil
self.recognitionTask = nil
DispatchQueue.main.asyncAfter(deadline: DispatchTime.now() + 1) {
self.startRecording()
}
debugPrint("****** recognitionTask != nil *************")
return
}
let audioSession = AVAudioSession.sharedInstance()
do {
try audioSession.setCategory(AVAudioSession.Category.multiRoute)
try audioSession.setMode(AVAudioSession.Mode.measurement)
try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
} catch {
print("audioSession properties weren't set because of an error.")
}
recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
guard let recognitionRequest = recognitionRequest else {
fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object")
}
recognitionRequest.shouldReportPartialResults = true
recognitionRequest.taskHint = .search
recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in
var isFinal = false
if result != nil {
self.delegate?.didSpoke(speechRecognizer: self, word: result?.bestTranscription.formattedString)
debugPrint(result?.bestTranscription.formattedString)
isFinal = (result?.isFinal)!
}
if error != nil {
debugPrint("Speech Error ====>",error)
inputNode.removeTap(onBus: 0)
self.audioEngine.stop()
self.recognitionRequest = nil
self.recognitionTask = nil
if BookReadingSettings.isSTTEnable{
DispatchQueue.main.asyncAfter(deadline: DispatchTime.now() + 1) {
self.startRecording()
}
}
// self.microphoneButton.isEnabled = true
}
})
// let recordingFormat = AVAudioFormat.init(commonFormat: .pcmFormatFloat32, sampleRate: <#T##Double#>, interleaved: <#T##Bool#>, channelLayout: <#T##AVAudioChannelLayout#>)//inputNode.outputFormat(forBus: 0)
inputNode.removeTap(onBus: 0)
let sampleRate = AVAudioSession.sharedInstance().sampleRate
let recordingFormat = AVAudioFormat(standardFormatWithSampleRate: sampleRate, channels: 1)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
self.recognitionRequest?.append(buffer)
}
audioEngine.prepare()
do {
try audioEngine.start()
} catch {
print("audioEngine couldn't start because of an error.")
}
debugPrint("Say something, I'm listening!")
//textView.text = "Say something, I'm listening!"
}
/* func stopRecording(){
isOn = false
debugPrint("Recording stoped")
self.audioEngine.stop()
recognitionTask?.cancel()
let inputNode = audioEngine.inputNode
inputNode.removeTap(onBus: 0)
self.recognitionRequest = nil
self.recognitionTask = nil
}*/
func stopRecording(){
isOn = false
debugPrint("Recording stopped")
let inputNode = audioEngine.inputNode
inputNode.removeTap(onBus: 0)
self.audioEngine.stop()
recognitionTask?.cancel()
self.recognitionRequest = nil
self.recognitionTask = nil
}
}
Currently I'm working on a speech-to-text feature for an iOS 10 app. The code below successfully returns speech to text in my app. I need to use this in several places (several view controllers). My question is: can someone explain how to make this reusable across all my view controllers? Is there a design pattern for this? Thanks in advance.
import UIKit
import Speech
class ViewController: UIViewController, SFSpeechRecognizerDelegate {
@IBOutlet weak var textView: UITextView!
@IBOutlet weak var microphoneButton: UIButton!
private let speechRecognizer = SFSpeechRecognizer(locale: Locale.init(identifier: "en-US"))!
private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
private var recognitionTask: SFSpeechRecognitionTask?
private let audioEngine = AVAudioEngine()
override func viewDidLoad() {
super.viewDidLoad()
microphoneButton.isEnabled = false
speechRecognizer.delegate = self
SFSpeechRecognizer.requestAuthorization { (authStatus) in
var isButtonEnabled = false
switch authStatus {
case .authorized:
isButtonEnabled = true
case .denied:
isButtonEnabled = false
print("User denied access to speech recognition")
case .restricted:
isButtonEnabled = false
print("Speech recognition restricted on this device")
case .notDetermined:
isButtonEnabled = false
print("Speech recognition not yet authorized")
}
OperationQueue.main.addOperation() {
self.microphoneButton.isEnabled = isButtonEnabled
}
}
}
@IBAction func microphoneTapped(_ sender: AnyObject) {
if audioEngine.isRunning {
audioEngine.stop()
recognitionRequest?.endAudio()
microphoneButton.isEnabled = false
microphoneButton.setTitle("Start Recording", for: .normal)
} else {
startRecording()
microphoneButton.setTitle("Stop Recording", for: .normal)
}
}
func startRecording() {
if recognitionTask != nil { //1
recognitionTask?.cancel()
recognitionTask = nil
}
let audioSession = AVAudioSession.sharedInstance() //2
do {
try audioSession.setCategory(AVAudioSessionCategoryRecord)
try audioSession.setMode(AVAudioSessionModeMeasurement)
try audioSession.setActive(true, with: .notifyOthersOnDeactivation)
} catch {
print("audioSession properties weren't set because of an error.")
}
recognitionRequest = SFSpeechAudioBufferRecognitionRequest() //3
guard let inputNode = audioEngine.inputNode else {
fatalError("Audio engine has no input node")
} //4
guard let recognitionRequest = recognitionRequest else {
fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object")
} //5
recognitionRequest.shouldReportPartialResults = true //6
recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in //7
var isFinal = false //8
if result != nil {
self.textView.text = result?.bestTranscription.formattedString //9
isFinal = (result?.isFinal)!
}
if error != nil || isFinal { //10
self.audioEngine.stop()
inputNode.removeTap(onBus: 0)
self.recognitionRequest = nil
self.recognitionTask = nil
self.microphoneButton.isEnabled = true
}
})
let recordingFormat = inputNode.outputFormat(forBus: 0) //11
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
self.recognitionRequest?.append(buffer)
}
audioEngine.prepare() //12
do {
try audioEngine.start()
} catch {
print("audioEngine couldn't start because of an error.")
}
textView.text = "Say something, I'm listening!"
}
func speechRecognizer(_ speechRecognizer: SFSpeechRecognizer, availabilityDidChange available: Bool) {
if available {
microphoneButton.isEnabled = true
} else {
microphoneButton.isEnabled = false
}
}
}
Arguably, that kind of code doesn't really belong in a view controller.
However, to directly answer the question, it seems to me that there are three basic options, roughly in order of preference:
1. Encapsulate the functionality you need in a new class and include that as a property in each of the view controllers that you need it in. (This would also solve the "doesn't belong in a view controller" complaint.)
2. Add a category (an extension, in Swift) to UIViewController. This would make your new methods available to all view controllers. You wouldn't be able to edit viewDidLoad and you'd need a setup method (or similar).
3. Subclass UIViewController with your methods and have all your other view controllers inherit from that. However, this would mean you couldn't use other Apple classes like UITableViewController.
4. Edit UIViewController with swizzling. (Terrible idea. Don't do this.)
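As a rough illustration of the first option, here is a sketch of the shape it could take. Everything named here (SpeechTranscribing, CannedTranscriber, SearchScreen) is hypothetical: the protocol stands for whatever class ends up wrapping the Speech code, the canned implementation exists only so the example runs without a microphone, and SearchScreen plays the role of a view controller.

```swift
import Foundation

/// Hypothetical abstraction over whatever object performs the recognition.
protocol SpeechTranscribing {
    func transcribe(completion: @escaping (String) -> Void)
}

/// Stand-in implementation so the sketch runs without a microphone.
/// A real one would wrap SFSpeechRecognizer and AVAudioEngine.
struct CannedTranscriber: SpeechTranscribing {
    let text: String
    func transcribe(completion: @escaping (String) -> Void) {
        completion(text)
    }
}

/// Plays the role of a view controller: it owns a transcriber as a
/// property and contains no recognition code of its own.
final class SearchScreen {
    private let transcriber: SpeechTranscribing
    private(set) var query = ""

    init(transcriber: SpeechTranscribing) {
        self.transcriber = transcriber
    }

    func microphoneTapped() {
        transcriber.transcribe { [weak self] text in
            self?.query = text
        }
    }
}

let screen = SearchScreen(transcriber: CannedTranscriber(text: "hello"))
screen.microphoneTapped()
print(screen.query) // prints "hello"
```

Each view controller then only needs a property of the protocol type, and the real recognizer can be swapped for a stub in tests.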
Basically I am learning ios speech recognition module following this tutorial:
https://medium.com/ios-os-x-development/speech-recognition-with-swift-in-ios-10-50d5f4e59c48
But when I test it on my iPhone 6, I always get this error:
Error Domain=kAFAssistantErrorDomain Code=216 "(null)"
I searched for it on the internet, but found very little information about it.
Here is my code:
//
// ViewController.swift
// speech_sample
//
// Created by Peizheng Ma on 6/22/17.
// Copyright © 2017 Peizheng Ma. All rights reserved.
//
import UIKit
import AVFoundation
import Speech
class ViewController: UIViewController, SFSpeechRecognizerDelegate {
//MARK: speech recognize variables
let audioEngine = AVAudioEngine()
let speechRecognizer: SFSpeechRecognizer? = SFSpeechRecognizer(locale: Locale.init(identifier: "en-US"))
var request = SFSpeechAudioBufferRecognitionRequest()
var recognitionTask: SFSpeechRecognitionTask?
var isRecording = false
override func viewDidLoad() {
// super.viewDidLoad()
// get Authorization
self.requestSpeechAuthorization()
}
override func didReceiveMemoryWarning() {
super.didReceiveMemoryWarning()
// Dispose of any resources that can be recreated.
}
//MARK: properties
@IBOutlet weak var detectText: UILabel!
@IBOutlet weak var startButton: UIButton!
//MARK: actions
@IBAction func startButtonTapped(_ sender: UIButton) {
if isRecording == true {
audioEngine.stop()
// if let node = audioEngine.inputNode {
// node.removeTap(onBus: 0)
// }
audioEngine.inputNode?.removeTap(onBus: 0)
// Indicate that the audio source is finished and no more audio will be appended
self.request.endAudio()
// Cancel the previous task if it's running
if let recognitionTask = recognitionTask {
recognitionTask.cancel()
self.recognitionTask = nil
}
//recognitionTask?.cancel()
//self.recognitionTask = nil
isRecording = false
startButton.backgroundColor = UIColor.gray
} else {
self.recordAndRecognizeSpeech()
isRecording = true
startButton.backgroundColor = UIColor.red
}
}
//MARK: show alert
func showAlert(title: String, message: String, handler: ((UIAlertAction) -> Swift.Void)? = nil) {
DispatchQueue.main.async { [unowned self] in
let alertController = UIAlertController(title: title, message: message, preferredStyle: .alert)
alertController.addAction(UIAlertAction(title: "OK", style: .cancel, handler: handler))
self.present(alertController, animated: true, completion: nil)
}
}
//MARK: Recognize Speech
func recordAndRecognizeSpeech() {
// Setup Audio Session
guard let node = audioEngine.inputNode else { return }
let recordingFormat = node.outputFormat(forBus: 0)
node.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { buffer, _ in
self.request.append(buffer)
}
audioEngine.prepare()
do {
try audioEngine.start()
} catch {
self.showAlert(title: "SpeechNote", message: "There has been an audio engine error.", handler: nil)
return print(error)
}
guard let myRecognizer = SFSpeechRecognizer() else {
self.showAlert(title: "SpeechNote", message: "Speech recognition is not supported for your current locale.", handler: nil)
return
}
if !myRecognizer.isAvailable {
self.showAlert(title: "SpeechNote", message: "Speech recognition is not currently available. Check back at a later time.", handler: nil)
// Recognizer is not available right now
return
}
recognitionTask = speechRecognizer?.recognitionTask(with: request, resultHandler: { result, error in
if let result = result {
let bestString = result.bestTranscription.formattedString
self.detectText.text = bestString
// var lastString: String = ""
// for segment in result.bestTranscription.segments {
// let indexTo = bestString.index(bestString.startIndex, offsetBy: segment.substringRange.location)
// lastString = bestString.substring(from: indexTo)
// }
// self.checkForColorsSaid(resultString: lastString)
} else if let error = error {
self.showAlert(title: "SpeechNote", message: "There has been a speech recognition error.", handler: nil)
print(error)
}
})
}
//MARK: - Check Authorization Status
func requestSpeechAuthorization() {
SFSpeechRecognizer.requestAuthorization { authStatus in
OperationQueue.main.addOperation {
switch authStatus {
case .authorized:
self.startButton.isEnabled = true
case .denied:
self.startButton.isEnabled = false
self.detectText.text = "User denied access to speech recognition"
case .restricted:
self.startButton.isEnabled = false
self.detectText.text = "Speech recognition restricted on this device"
case .notDetermined:
self.startButton.isEnabled = false
self.detectText.text = "Speech recognition not yet authorized"
}
}
}
}
}
Thank you very much.
I had the same problem whilst following the same (excellent) tutorial, even when using the example code on GitHub. To solve it, I had to do two things:
Firstly, add request.endAudio() at the start of the code to stop recording in the startButtonTapped action. This marks the end of the recording. I see you've already done that in your sample code.
Secondly, in the recordAndRecognizeSpeech function, when 'recognitionTask' is started, if no speech was detected then 'result' will be nil and the error case is triggered. So, I tested for result != nil before attempting to assign the result.
So, the code for those two functions looks as follows:
1. Updated startButtonTapped:
@IBAction func startButtonTapped(_ sender: UIButton) {
if isRecording {
request.endAudio() // Added line to mark end of recording
audioEngine.stop()
if let node = audioEngine.inputNode {
node.removeTap(onBus: 0)
}
recognitionTask?.cancel()
isRecording = false
startButton.backgroundColor = UIColor.gray
} else {
self.recordAndRecognizeSpeech()
isRecording = true
startButton.backgroundColor = UIColor.red
}
}
And 2. Update within recordAndRecognizeSpeech from the recognitionTask = ... line:
recognitionTask = speechRecognizer?.recognitionTask(with: request, resultHandler: { (result, error) in
if result != nil { // check to see if result is empty (i.e. no speech found)
if let result = result {
let bestString = result.bestTranscription.formattedString
self.detectedTextLabel.text = bestString
var lastString: String = ""
for segment in result.bestTranscription.segments {
let indexTo = bestString.index(bestString.startIndex, offsetBy: segment.substringRange.location)
lastString = bestString.substring(from: indexTo)
}
self.checkForColoursSaid(resultString: lastString)
} else if let error = error {
self.sendAlert(message: "There has been a speech recognition error")
print(error)
}
}
})
I hope that helps you.
This will prevent two errors: The above mentioned Code=216 and the 'SFSpeechAudioBufferRecognitionRequest cannot be re-used' error.
Stop recognition with finish(), not with cancel().
Stop the audio.
Like so:
// stop recognition
recognitionTask?.finish()
recognitionTask = nil
// stop audio
request.endAudio()
audioEngine.stop()
audioEngine.inputNode.removeTap(onBus: 0) // Remove tap on bus when stopping recording.
P.S. audioEngine.inputNode seems to be no longer an optional value, therefore I used no if let construct.
Hey, I was getting the same error, but now it's working absolutely fine. Hope this code helps you too :).
import UIKit
import Speech
class SpeechVC: UIViewController {
@IBOutlet weak var slabel: UILabel!
@IBOutlet weak var sbutton: UIButton!
let audioEngine = AVAudioEngine()
let SpeechRecognizer : SFSpeechRecognizer? = SFSpeechRecognizer()
let request = SFSpeechAudioBufferRecognitionRequest()
var recognitionTask:SFSpeechRecognitionTask?
var isRecording = false
override func viewDidLoad() {
super.viewDidLoad()
self.requestSpeechAuthorization()
// Do any additional setup after loading the view, typically from a nib.
}
func recordAndRecognizeSpeech()
{
guard let node = audioEngine.inputNode else { return }
let recordingFormat = node.outputFormat(forBus: 0)
node.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { buffer , _ in
self.request.append(buffer)
}
audioEngine.prepare()
do
{
try audioEngine.start()
}catch
{
return print(error)
}
guard let myRecognizer = SFSpeechRecognizer() else {
return
}
if !myRecognizer.isAvailable
{
return
}
recognitionTask = SpeechRecognizer?.recognitionTask(with: request, resultHandler: { result, error in
if let result = result
{
let bestString = result.bestTranscription.formattedString
self.slabel.text = bestString
var lastString : String = ""
for segment in result.bestTranscription.segments
{
let indexTo = bestString.index(bestString.startIndex, offsetBy: segment.substringRange.location)
lastString = bestString.substring(from: indexTo)
}
}else if let error = error
{
print(error)
}
})
}
@IBAction func startAction(_ sender: Any) {
if isRecording == true
{
audioEngine.stop()
recognitionTask?.cancel()
isRecording = false
sbutton.backgroundColor = UIColor.gray
}
else{
self.recordAndRecognizeSpeech()
isRecording = true
sbutton.backgroundColor = UIColor.red
}
}
func cancelRecording()
{
audioEngine.stop()
if let node = audioEngine.inputNode
{
node.removeTap(onBus: 0)
}
recognitionTask?.cancel()
}
func requestSpeechAuthorization()
{
SFSpeechRecognizer.requestAuthorization { authStatus in
OperationQueue.main.addOperation {
switch authStatus
{
case .authorized :
self.sbutton.isEnabled = true
case .denied :
self.sbutton.isEnabled = false
self.slabel.text = "User denied access to speech recognition"
case .restricted :
self.sbutton.isEnabled = false
self.slabel.text = "Speech Recognition is restricted on this Device"
case .notDetermined :
self.sbutton.isEnabled = false
self.slabel.text = "Speech Recognition not yet authorized"
}
}
}
}
}
I had this error because I was running the app on the Simulator. Running on a regular device solves the issue.
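If you want the app to warn you about this instead of failing with Code=216, a compile-time check is one option. This is just a sketch; the function name isRunningOnSimulator is made up for the example, and it needs Swift 4.1 or later for targetEnvironment.

```swift
import Foundation

/// Hypothetical helper: true only when the code was compiled for a simulator.
func isRunningOnSimulator() -> Bool {
    #if targetEnvironment(simulator)
    return true
    #else
    return false
    #endif
}

if isRunningOnSimulator() {
    print("Speech recognition may fail on the Simulator; test on a physical device.")
}
```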
I'm trying to run Text To Speech (AVSpeechSynthesizer) along with Speech To Text from Siri Kit, but I'm stuck with it.
My TTS works perfectly until I run the code that executes the STT; after that my TTS doesn't work anymore. I debugged the code and no errors occur during execution, but my text is no longer converted to speech. I think my STT is somehow disabling the audio output and that's why the TTS doesn't convert the text to speech anymore, but that's just a theory. Note: my TTS stops working, but my STT works perfectly.
Any tips?
Here's my viewController's code:
@IBOutlet weak var microphoneButton: UIButton!
//text to speech
let speechSynthesizer = AVSpeechSynthesizer()
//speech to text
private var speechRecognizer: SFSpeechRecognizer!
private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
private var recognitionTask: SFSpeechRecognitionTask?
private var audioEngine = AVAudioEngine()
@IBAction func textToSpeech(_ sender: Any) {
if let word = wordTextField.text{
if !speechSynthesizer.isSpeaking {
//get current dictionary
let dictionary = fetchSelectedDictionary()
//get current language
let language = languagesWithCodes[(dictionary?.language)!]
let speechUtterance = AVSpeechUtterance(string: word)
speechUtterance.voice = AVSpeechSynthesisVoice(language: language)
speechUtterance.rate = 0.4
//speechUtterance.pitchMultiplier = pitch
//speechUtterance.volume = volume
speechSynthesizer.speak(speechUtterance)
}
else{
speechSynthesizer.continueSpeaking()
}
}
}
@IBAction func speechToText(_ sender: Any) {
if audioEngine.isRunning {
audioEngine.stop()
recognitionRequest?.endAudio()
microphoneButton.isEnabled = false
microphoneButton.setTitle("Start Recording", for: .normal)
} else {
startRecording()
microphoneButton.setTitle("Stop Recording", for: .normal)
}
}
func startRecording() {
if recognitionTask != nil {
recognitionTask?.cancel()
recognitionTask = nil
}
let audioSession = AVAudioSession.sharedInstance()
do {
try audioSession.setCategory(AVAudioSessionCategoryRecord)
try audioSession.setMode(AVAudioSessionModeMeasurement)
try audioSession.setActive(true, with: .notifyOthersOnDeactivation)
} catch {
print("audioSession properties weren't set because of an error.")
}
recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
guard let inputNode = audioEngine.inputNode else {
fatalError("Audio engine has no input node")
}
guard let recognitionRequest = recognitionRequest else {
fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object")
}
recognitionRequest.shouldReportPartialResults = true
recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in
var isFinal = false
if result != nil {
self.wordTextField.text = result?.bestTranscription.formattedString
isFinal = (result?.isFinal)!
}
if error != nil || isFinal {
self.audioEngine.stop()
inputNode.removeTap(onBus: 0)
self.recognitionRequest = nil
self.recognitionTask = nil
self.microphoneButton.isEnabled = true
}
})
let recordingFormat = inputNode.outputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
self.recognitionRequest?.append(buffer)
}
audioEngine.prepare()
do {
try audioEngine.start()
} catch {
print("audioEngine couldn't start because of an error.")
}
wordTextField.text = "Say something, I'm listening!"
}
}
This line:
try audioSession.setMode(AVAudioSessionModeMeasurement)
is probably the reason. It can cause the volume to be throttled so low that it sounds like it is off. Try:
try audioSession.setMode(AVAudioSessionModeDefault)
and see if it works.
Probably because your audio session is in the Record category. You have two solutions. The first is to change try audioSession.setCategory(AVAudioSessionCategoryRecord) to use AVAudioSessionCategoryPlayAndRecord (this will work). A cleaner way is to have a separate function for saying something, and in it set your category to AVAudioSessionCategoryPlayback.
Hope this helped.
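For reference, the first suggestion written with the newer API names (Swift 4.2 and later) might look like the sketch below. The function name is made up, and the .defaultToSpeaker option is my own assumption: with play-and-record the output can otherwise be routed to the quiet receiver at the top of the phone, which would also make TTS seem silent.

```swift
import AVFoundation

/// Hypothetical helper: configure the shared session so that recording
/// (speech to text) and playback (text to speech) can coexist.
func configureSessionForSpeechInAndOut() throws {
    let session = AVAudioSession.sharedInstance()
    // .playAndRecord keeps the output route alive while the microphone is in use.
    // .defaultToSpeaker is an assumption added here, not from the answers above.
    try session.setCategory(.playAndRecord, mode: .default, options: [.defaultToSpeaker])
    try session.setActive(true, options: .notifyOthersOnDeactivation)
}
```

Calling this in place of the three audioSession lines in startRecording would be one way to try both answers at once, since it also uses the default mode instead of measurement.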
I got this error while implementing speech to text:
Terminating app due to uncaught exception
'com.apple.coreaudio.avfaudio', reason: 'required condition is false:
_recordingTap == nil'
and:
ERROR: [0x1b2df5c40] >avae> AVAudioNode.mm:565: CreateRecordingTap:
required condition is false: _recordingTap == nil
Here is the code of my viewController:
public class ViewController: UIViewController, SFSpeechRecognizerDelegate {
// MARK: Properties
private let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!
private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
private var recognitionTask: SFSpeechRecognitionTask?
private let audioEngine = AVAudioEngine()
@IBOutlet var textView : UITextView!
@IBOutlet var recordButton : UIButton!
// MARK: UIViewController
public override func viewDidLoad() {
super.viewDidLoad()
// Disable the record buttons until authorization has been granted.
recordButton.isEnabled = false
}
override public func viewDidAppear(_ animated: Bool) {
speechRecognizer.delegate = self
SFSpeechRecognizer.requestAuthorization { authStatus in
/*
The callback may not be called on the main thread. Add an
operation to the main queue to update the record button's state.
*/
OperationQueue.main.addOperation {
switch authStatus {
case .authorized:
self.recordButton.isEnabled = true
case .denied:
self.recordButton.isEnabled = false
self.recordButton.setTitle("User denied access to speech recognition", for: .disabled)
case .restricted:
self.recordButton.isEnabled = false
self.recordButton.setTitle("Speech recognition restricted on this device", for: .disabled)
case .notDetermined:
self.recordButton.isEnabled = false
self.recordButton.setTitle("Speech recognition not yet authorized", for: .disabled)
}
}
}
}
private func startRecording() throws {
// Cancel the previous task if it's running.
if let recognitionTask = recognitionTask {
recognitionTask.cancel()
self.recognitionTask = nil
}
let audioSession = AVAudioSession.sharedInstance()
try audioSession.setCategory(AVAudioSessionCategoryRecord)
try audioSession.setMode(AVAudioSessionModeMeasurement)
try audioSession.setActive(true, with: .notifyOthersOnDeactivation)
recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
guard let inputNode = audioEngine.inputNode else { fatalError("Audio engine has no input node") }
guard let recognitionRequest = recognitionRequest else { fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object") }
// Configure request so that results are returned before audio recording is finished
recognitionRequest.shouldReportPartialResults = true
// A recognition task represents a speech recognition session.
// We keep a reference to the task so that it can be cancelled.
recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest) { result, error in
var isFinal = false
if let result = result {
self.textView.text = result.bestTranscription.formattedString
isFinal = result.isFinal
}
if error != nil || isFinal {
self.audioEngine.stop()
inputNode.removeTap(onBus: 0)
self.recognitionRequest = nil
self.recognitionTask = nil
self.recordButton.isEnabled = true
self.recordButton.setTitle("Start Recording", for: [])
}
}
let recordingFormat = inputNode.outputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
self.recognitionRequest?.append(buffer)
}
audioEngine.prepare()
try audioEngine.start()
textView.text = "(Go ahead, I'm listening)"
}
// MARK: SFSpeechRecognizerDelegate
public func speechRecognizer(_ speechRecognizer: SFSpeechRecognizer, availabilityDidChange available: Bool) {
if available {
recordButton.isEnabled = true
recordButton.setTitle("Start Recording", for: [])
} else {
recordButton.isEnabled = false
recordButton.setTitle("Recognition not available", for: .disabled)
}
}
// MARK: Interface Builder actions
@IBAction func recordButtonTapped() {
if audioEngine.isRunning {
audioEngine.stop()
recognitionRequest?.endAudio()
recordButton.isEnabled = false
recordButton.setTitle("Stopping", for: .disabled)
} else {
try! startRecording()
recordButton.setTitle("Stop recording", for: [])
}
}
}
You can try calling this when you stop recording:
Swift 3:
audioEngine.inputNode?.removeTap(onBus: 0)
It helped me and should help you too.
You probably already have a tap on the bus, and you can't have another one on that same bus. You should call removeTap(onBus:) when you stop your engine.
audioEngine.inputNode?.removeTap(onBus: 0)
The error is telling you that you already have a tap installed on that bus and that you can't have another one.
First you have to remove the tap for that bus. Then again you can install tap on the bus.
let inputNode = audioEngine.inputNode
inputNode.removeTap(onBus: 0)
It will help.
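Putting the answers above together, one way to make the crash hard to hit is to always remove the tap immediately before installing it. This is a sketch only: installFreshTap is a made-up name, and it assumes the newer SDK where inputNode is no longer optional.

```swift
import AVFoundation

/// Hypothetical helper: makes sure bus 0 of the input node carries
/// exactly one tap by removing any existing tap before installing.
func installFreshTap(on engine: AVAudioEngine,
                     handler: @escaping (AVAudioPCMBuffer) -> Void) {
    let inputNode = engine.inputNode
    // Remove first, as in the answers above, so a leftover tap from an
    // earlier recording cannot trigger "_recordingTap == nil".
    inputNode.removeTap(onBus: 0)
    let format = inputNode.outputFormat(forBus: 0)
    inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
        handler(buffer)
    }
}
```

In startRecording this would replace the installTap call, with the handler appending each buffer to the recognition request.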