How to provide hints to the iOS speech recognition API?

I want to create an app that receives voice input using the iOS speech API.
In Google's API, there is a speechContext option with which I can provide hints or a bias toward some uncommon words.
Does the iOS API provide this feature? I've been searching for a while but didn't find anything.

There is no sample code online showing how to implement hints with Google Cloud Speech in Swift, so I came up with my own!
Open this class: SpeechRecognitionService.swift
You have to add your array of hints to the SpeechContext, add the SpeechContext to the RecognitionConfig, and finally add the RecognitionConfig to the StreamingRecognitionConfig, like this:
let recognitionConfig = RecognitionConfig()
recognitionConfig.encoding = .linear16
recognitionConfig.sampleRateHertz = Int32(sampleRate)
recognitionConfig.languageCode = "en-US"
recognitionConfig.maxAlternatives = 3
recognitionConfig.enableWordTimeOffsets = true

let streamingRecognitionConfig = StreamingRecognitionConfig()
streamingRecognitionConfig.singleUtterance = true
streamingRecognitionConfig.interimResults = true

// Custom vocabulary (hints)
let phraseArray = NSMutableArray(array: ["my donkey is yayeerobee", "my horse is tekkadan", "bet four for kalamazoo"])
let mySpeechContext = SpeechContext()
mySpeechContext.phrasesArray = phraseArray
recognitionConfig.speechContextsArray = NSMutableArray(array: [mySpeechContext])
streamingRecognitionConfig.config = recognitionConfig
// End of custom vocabulary (hints)

let streamingRecognizeRequest = StreamingRecognizeRequest()
streamingRecognizeRequest.streamingConfig = streamingRecognitionConfig
Bonus: I got better results when I added my custom words mixed inside a simple phrase rather than adding each word on its own.
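For reference, if you stay with Apple's own Speech framework (SFSpeechRecognizer) rather than Google's, its recognition requests expose a similar hint mechanism through the contextualStrings property (iOS 10+). A minimal sketch, assuming you already build an SFSpeechAudioBufferRecognitionRequest elsewhere in your recognition code:

import Speech

let request = SFSpeechAudioBufferRecognitionRequest()
// Phrases the recognizer should be biased toward (uncommon or domain-specific words).
request.contextualStrings = ["yayeerobee", "tekkadan", "kalamazoo"]
// ...append audio buffers and start a recognition task with SFSpeechRecognizer as usual.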

Related

Search Bar AutoComplete by Address and Place Name - Swift 4

Goal
I want my search box auto-complete to detect both place names and addresses.
Currently, my search box auto-complete detects only addresses.
I used this library in my iOS app. The results look good for addresses but not for place names.
https://github.com/shmidt/GooglePlacesSearchController
I want it to detect places, like B&H Photo in NYC and so on.
Example of what I'm trying to do.
Is it a paid service that I need to enable in the Google API Console Library?
Can someone shed some light on this?
Just change your GooglePlacesSearchController initialization parameter placeType to .noFilter:
let controller = GooglePlacesSearchController(
    delegate: self, apiKey: GoogleMapsAPIServerKey, placeType: .noFilter)
For anyone with the same question but a different implementation, just add a filter to your GMSAutocompleteViewController:
let autocompleteController = GMSAutocompleteViewController()
autocompleteController.delegate = self
// Specify a filter.
let filter = GMSAutocompleteFilter()
filter.type = .noFilter
autocompleteController.autocompleteFilter = filter
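In case it helps, here is a rough sketch of the surrounding wiring, assuming you present the controller yourself and conform to GMSAutocompleteViewControllerDelegate. The delegate method names follow the Places SDK; the PlaceSearchViewController class is hypothetical, and place.name / place.formattedAddress are optional in recent SDK versions:

import UIKit
import GooglePlaces

// Hypothetical view controller that presents the autocomplete UI.
class PlaceSearchViewController: UIViewController, GMSAutocompleteViewControllerDelegate {

    @objc func searchTapped() {
        let autocompleteController = GMSAutocompleteViewController()
        autocompleteController.delegate = self

        // No filter, so both addresses and place names are returned.
        let filter = GMSAutocompleteFilter()
        filter.type = .noFilter
        autocompleteController.autocompleteFilter = filter

        present(autocompleteController, animated: true, completion: nil)
    }

    func viewController(_ viewController: GMSAutocompleteViewController, didAutocompleteWith place: GMSPlace) {
        print("Selected: \(place.name ?? "") - \(place.formattedAddress ?? "")")
        dismiss(animated: true, completion: nil)
    }

    func viewController(_ viewController: GMSAutocompleteViewController, didFailAutocompleteWithError error: Error) {
        print("Autocomplete error: \(error)")
        dismiss(animated: true, completion: nil)
    }

    func wasCancelled(_ viewController: GMSAutocompleteViewController) {
        dismiss(animated: true, completion: nil)
    }

    // Prediction callbacks; left empty here.
    func didRequestAutocompletePredictions(_ viewController: GMSAutocompleteViewController) {}
    func didUpdateAutocompletePredictions(_ viewController: GMSAutocompleteViewController) {}
}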

Is it possible to integrate Google translation into an iOS app programmatically (Swift 4)?

I have to translate data coming from an API in my app, so I need to integrate Google Translate or something similar that can translate the data coming from the backend.
How do I start coding this?
There is no specific SDK for Google translation on the iOS platform.
However, you can achieve it by calling the translation API(s) "manually". For more information, check the Google Cloud Translation API documentation.
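If you prefer to call the official Cloud Translation REST API yourself, here is a minimal sketch using URLSession against the documented v2 endpoint (it assumes you have an API key; error handling is omitted):

import Foundation

func translate(_ text: String, from source: String, to target: String, apiKey: String,
               completion: @escaping (String?) -> Void) {
    var components = URLComponents(string: "https://translation.googleapis.com/language/translate/v2")!
    components.queryItems = [
        URLQueryItem(name: "key", value: apiKey),
        URLQueryItem(name: "q", value: text),
        URLQueryItem(name: "source", value: source),
        URLQueryItem(name: "target", value: target),
        URLQueryItem(name: "format", value: "text")
    ]
    guard let url = components.url else { completion(nil); return }

    // Documented v2 response shape: {"data": {"translations": [{"translatedText": "..."}]}}
    URLSession.shared.dataTask(with: url) { data, _, _ in
        guard let data = data,
              let json = (try? JSONSerialization.jsonObject(with: data)) as? [String: Any],
              let translations = (json["data"] as? [String: Any])?["translations"] as? [[String: Any]],
              let translated = translations.first?["translatedText"] as? String else {
            completion(nil)
            return
        }
        completion(translated)
    }.resume()
}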
Furthermore, using ROGoogleTranslate might save some time; with it, you would be able to do it like this:
var params = ROGoogleTranslateParams(source: "en",
                                     target: "de",
                                     text: "The sentence to be translated")
let translator = ROGoogleTranslate(with: "API Key here")
translator.translate(params: params) { (result) in
    print("Translation: \(result)")
}
You can use the Google Translate URL to translate text directly inside your application.
Follow this example:
let selected_language = "en"
let target_language = "hi"
let YourString = "hello"
let GoogleUrl = "https://translate.googleapis.com/translate_a/single?client=gtx&sl=" + selected_language + "&tl=" + target_language + "&dt=t&dt=t&q=" + YourString
After creating the GoogleUrl, perform a GET request on this URL using URLSession or Alamofire; the URL returns a JSON response containing the translated text.
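A minimal sketch of that GET request with URLSession, reusing the variables from the snippet above. Note that the text should be percent-encoded, and that this client=gtx endpoint is unofficial, so its nested-array response format may change:

let query = YourString.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed) ?? YourString
let urlString = "https://translate.googleapis.com/translate_a/single?client=gtx&sl=\(selected_language)&tl=\(target_language)&dt=t&q=\(query)"

if let url = URL(string: urlString) {
    URLSession.shared.dataTask(with: url) { data, _, error in
        guard let data = data, error == nil else { return }
        // Unofficial response: a nested JSON array whose first element holds
        // [translatedSegment, originalSegment, ...] entries.
        if let result = (try? JSONSerialization.jsonObject(with: data)) as? [Any],
           let segments = result.first as? [[Any]] {
            let translation = segments.compactMap { $0.first as? String }.joined()
            print("Translation: \(translation)")
        }
    }.resume()
}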

How to detect text (string) language in iOS?

For instance, given the following strings:
let textEN = "The quick brown fox jumps over the lazy dog"
let textES = "El zorro marrón rápido salta sobre el perro perezoso"
let textAR = "الثعلب البني السريع يقفز فوق الكلب الكسول"
let textDE = "Der schnelle braune Fuchs springt über den faulen Hund"
I want to detect the used language in each of them.
Let's assume the signature for the implemented function is:
func detectedLanguage<T: StringProtocol>(_ forString: T) -> String?
It returns an optional string, which is nil when no language can be detected.
Thus the appropriate results would be:
let englishDetectedLanguage = detectedLanguage(textEN) // => English
let spanishDetectedLanguage = detectedLanguage(textES) // => Spanish
let arabicDetectedLanguage = detectedLanguage(textAR) // => Arabic
let germanDetectedLanguage = detectedLanguage(textDE) // => German
Is there an easy approach to achieve it?
Latest versions (iOS 12+)
Briefly:
You could achieve it by using NLLanguageRecognizer, as:
import NaturalLanguage

func detectedLanguage(for string: String) -> String? {
    let recognizer = NLLanguageRecognizer()
    recognizer.processString(string)
    guard let languageCode = recognizer.dominantLanguage?.rawValue else { return nil }
    let detectedLanguage = Locale.current.localizedString(forIdentifier: languageCode)
    return detectedLanguage
}
Older versions (iOS 11+)
Briefly:
You could achieve it by using NSLinguisticTagger, as:
func detectedLanguage<T: StringProtocol>(_ forString: T) -> String? {
    guard let languageCode = NSLinguisticTagger.dominantLanguage(for: String(forString)) else {
        return nil
    }
    let detectedLanguage = Locale.current.localizedString(forIdentifier: languageCode)
    return detectedLanguage
}
Details:
First of all, you should be aware that what you are asking about mainly relates to the world of Natural Language Processing (NLP).
Since NLP is about much more than detecting the language of a text, the rest of the answer will not contain NLP-specific information.
Obviously, implementing such functionality yourself is not easy, especially once you start to care about the details of the process, such as splitting the text into sentences and even into words, then recognizing names, punctuation, and so on. You would probably think, "what a painful process! It doesn't even make sense to do it myself." Fortunately, iOS does support NLP (in fact, the NLP APIs are available on all Apple platforms, not only iOS), which makes what you are aiming for easy to implement. The core component you would work with is NSLinguisticTagger:
Analyze natural language text to tag part of speech and lexical class,
identify names, perform lemmatization, and determine the language and
script.
NSLinguisticTagger provides a uniform interface to a variety of
natural language processing functionality with support for many
different languages and scripts. You can use this class to segment
natural language text into paragraphs, sentences, or words, and tag
information about those segments, such as part of speech, lexical
class, lemma, script, and language.
As mentioned in the class documentation, the method you are looking for, under the Determining the Dominant Language and Orthography section, is dominantLanguage(for:):
Returns the dominant language for the specified string.
...
Return Value
The BCP-47 tag identifying the dominant language of the string, or the
tag "und" if a specific language cannot be determined.
You might notice that NSLinguisticTagger has existed since iOS 5. However, the dominantLanguage(for:) method is only supported on iOS 11 and above, because it was built on top of the Core ML framework:
. . .
Core ML is the foundation for domain-specific frameworks and
functionality. Core ML supports Vision for image analysis, Foundation
for natural language processing (for example, the NSLinguisticTagger
class), and GameplayKit for evaluating learned decision trees. Core ML
itself builds on top of low-level primitives like Accelerate and BNNS,
as well as Metal Performance Shaders.
Based on the returned value from calling dominantLanguage(for:) and passing "The quick brown fox jumps over the lazy dog":
NSLinguisticTagger.dominantLanguage(for: "The quick brown fox jumps over the lazy dog")
would be "en" optional string. However, so far that is not the desired output, the expectation is to get "English" instead! Well, that is exactly what you should get by calling the localizedString(forLanguageCode:) method from Locale Structure and passing the gotten language code:
Locale.current.localizedString(forIdentifier: "en") // English
Putting it all together:
As shown in the "Older versions" snippet above, the function would be:
func detectedLanguage<T: StringProtocol>(_ forString: T) -> String? {
    guard let languageCode = NSLinguisticTagger.dominantLanguage(for: String(forString)) else {
        return nil
    }
    let detectedLanguage = Locale.current.localizedString(forIdentifier: languageCode)
    return detectedLanguage
}
Output:
It would be as expected:
let englishDetectedLanguage = detectedLanguage(textEN) // => English
let spanishDetectedLanguage = detectedLanguage(textES) // => Spanish
let arabicDetectedLanguage = detectedLanguage(textAR) // => Arabic
let germanDetectedLanguage = detectedLanguage(textDE) // => German
Note that:
There are still cases where you will not get a language name for a given string, such as:
let textUND = "SdsOE"
let undefinedDetectedLanguage = detectedLanguage(textUND) // => Unknown language
Or it could even be nil:
let rubbish = "000747322"
let rubbishDetectedLanguage = detectedLanguage(rubbish) // => nil
Still, I find that not a bad result for producing useful output...
Furthermore:
About NSLinguisticTagger:
Although I am not going to dive deep into NSLinguisticTagger usage, I would like to note that it has a couple of really cool features beyond simply detecting the language of a given text. As a pretty simple example, using the lemma scheme when enumerating tags is very helpful when working with information retrieval, since it lets you recognize the word "drive" when the input contains "driving".
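To illustrate the lemma idea, here is a minimal sketch using the iOS 11+ enumerateTags(in:unit:scheme:options:using:) API (the sample sentence is just an example):

import Foundation

let tagger = NSLinguisticTagger(tagSchemes: [.lemma], options: 0)
let text = "She was driving to the airport"
tagger.string = text

let range = NSRange(location: 0, length: text.utf16.count)
let options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace]

tagger.enumerateTags(in: range, unit: .word, scheme: .lemma, options: options) { tag, tokenRange, _ in
    if let lemma = tag?.rawValue {
        let word = (text as NSString).substring(with: tokenRange)
        print("\(word) -> \(lemma)")   // e.g. "driving -> drive"
    }
}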
Official Resources
Apple Video Sessions:
For more about Natural Language Processing and how NSLinguisticTagger works: Natural Language Processing and your Apps.
Also, for getting familiar with Core ML:
Introducing Core ML.
Core ML in depth.
You can use NSLinguisticTagger's tag(at:) method; NSLinguisticTagger itself has been available since iOS 5.
func detectLanguage<T: StringProtocol>(for text: T) -> String? {
    let tagger = NSLinguisticTagger(tagSchemes: [.language], options: 0)
    tagger.string = String(text)
    guard let languageTag = tagger.tag(at: 0, scheme: .language, tokenRange: nil, sentenceRange: nil) else { return nil }
    return Locale.current.localizedString(forIdentifier: languageTag.rawValue)
}
detectLanguage(for: "The quick brown fox jumps over the lazy dog") // English
detectLanguage(for: "El zorro marrón rápido salta sobre el perro perezoso") // Spanish
detectLanguage(for: "الثعلب البني السريع يقفز فوق الكلب الكسول") // Arabic
detectLanguage(for: "Der schnelle braune Fuchs springt über den faulen Hund") // German
I tried NSLinguisticTagger with short input text like "hello", and it always recognized it as Italian.
Luckily, Apple recently added NLLanguageRecognizer, available on iOS 12, and it seems to be more accurate :D
import NaturalLanguage

if #available(iOS 12.0, *) {
    let languageRecognizer = NLLanguageRecognizer()
    languageRecognizer.processString(text)
    if let code = languageRecognizer.dominantLanguage?.rawValue {
        let language = Locale.current.localizedString(forIdentifier: code)
        print(language ?? "Unknown")
    }
}

AVSpeechSynthesizer High Quality Voices

Is it possible to use the enhanced/high-quality voices (e.g. Alex in the U.S.) with the speech synthesizer? I have downloaded the voices but find no way to tell the synthesizer to use one of them rather than the default voice.
Since voices are generally selected by BCP-47 codes and there is only one for US English, it appears there is no way to further differentiate voices. Am I missing something? (One would think Apple might have considered a need for different dialects, but I am not seeing it.)
TIA.
Yes, it is possible to pick from the two that seem to be available on my system, like this:
import AVFoundation

class Speak {
    let voices = AVSpeechSynthesisVoice.speechVoices()
    let voiceSynth = AVSpeechSynthesizer()
    var voiceToUse: AVSpeechSynthesisVoice?

    init() {
        for voice in voices {
            if voice.name == "Samantha (Enhanced)" && voice.quality == .enhanced {
                voiceToUse = voice
            }
        }
    }

    func sayThis(_ phrase: String) {
        let utterance = AVSpeechUtterance(string: phrase)
        utterance.voice = voiceToUse
        utterance.rate = 0.5
        voiceSynth.speak(utterance)
    }
}
Then, somewhere in your app, do something like this:
let voice = Speak()
voice.sayThis("I'm speaking better Seppo, now!")
This was a bug in previous versions of iOS: apps using the synthesizer weren't getting the enhanced voices. The bug was fixed in iOS 10, which now uses the enhanced voices.
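If you specifically want Alex, there is also a built-in identifier constant you could try. A minimal sketch, assuming the enhanced Alex voice has been downloaded on the device (otherwise the initializer returns nil and the default voice is used):

import AVFoundation

let synthesizer = AVSpeechSynthesizer()   // keep a strong reference, e.g. as a property

func speakWithAlex(_ phrase: String) {
    let utterance = AVSpeechUtterance(string: phrase)
    // AVSpeechSynthesisVoice(identifier:) returns nil if the voice isn't installed.
    if let alex = AVSpeechSynthesisVoice(identifier: AVSpeechSynthesisVoiceIdentifierAlex) {
        utterance.voice = alex
    }
    synthesizer.speak(utterance)
}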

Customizing subtitles with AVPlayer

I was able to display a subtitle track with AVPlayer on iOS 6, but I am not able to customize it. It just shows the same style (a small font size, in white).
Here's how I select the subtitle:
AVMediaSelectionGroup *subtitle = [asset mediaSelectionGroupForMediaCharacteristic: AVMediaCharacteristicLegible];
[self.videoPlayer.currentItem selectMediaOption:subtitle.options[0] inMediaSelectionGroup: subtitle];
And how I'm trying to customize the subtitle:
AVTextStyleRule *rule = [[AVTextStyleRule alloc] initWithTextMarkupAttributes:@{
    (id)kCMTextMarkupAttribute_ForegroundColorARGB : @[ @1, @1, @0, @0 ],
    (id)kCMTextMarkupAttribute_ItalicStyle : @(YES)}];
self.videoPlayer.currentItem.textStyleRules = @[rule];
No matter if I put this snippet before or after selecting the subtitle, the result is the same.
The AVPlayer is created with a local file URL (an mp4 file).
Any thoughts on how to do this?
I asked this question on Apple Developer Forums and I got an answer from an Apple employee:
The textStyleRules property only applies to WebVTT content. Your local file is probably carrying subtitles in TX3G format.
You're right that the documentation doesn't mention this limitation, so you should file a bug so we can get our documentation updated.
So, I'll open a radar to ask that they update the docs and I'll post its number here if someone wants to dupe it.
EDIT:
I created rdar://14923673 to ask Apple to update the docs about this current limitation. I also created rdar://14923755 to ask them to provide support to subtitles in TX3G format.
Please dupe them if you're affected by this issue.
I found a workaround to modify the text foreground color and background correctly:
just separate the styles into multiple AVTextStyleRule objects.
func initSubtitleStyle() {
    let textStyle = AVTextStyleRule(textMarkupAttributes: [
        kCMTextMarkupAttribute_CharacterBackgroundColorARGB as String: [0.2, 0.3, 0.0, 0.3]
    ])!
    let textStyle1 = AVTextStyleRule(textMarkupAttributes: [
        kCMTextMarkupAttribute_ForegroundColorARGB as String: [0.2, 0.8, 0.4, 0.0]
    ])!
    let textStyle2 = AVTextStyleRule(textMarkupAttributes: [
        kCMTextMarkupAttribute_BaseFontSizePercentageRelativeToVideoHeight as String: 20,
        kCMTextMarkupAttribute_CharacterEdgeStyle as String: kCMTextMarkupCharacterEdgeStyle_None
    ])!
    player.currentItem?.textStyleRules = [textStyle, textStyle1, textStyle2]
}
Please do not ask me why; that solution came from trial and error. XD
