I am implementing a speech-to-text feature in my app with OpenEars. I am also using the Rejecto plugin to improve recognition and RapidEars for faster results. The goal is to detect phrases as well as single words, for example:
lmGenerator = [[LanguageModelGenerator alloc] init];

NSArray *words = [NSArray arrayWithObjects:@"REBETANDEAL", @"NEWBET", @"REEEBET", nil];
NSString *name = @"NameIWantForMyLanguageModelFiles";
NSError *err = [lmGenerator generateRejectingLanguageModelFromArray:words
                                                     withFilesNamed:name
                                             withOptionalExclusions:nil
                                                    usingVowelsOnly:FALSE
                                                         withWeight:nil
                                             forAcousticModelAtPath:[AcousticModel pathToModel:@"AcousticModelEnglish"]]; // Change "AcousticModelEnglish" to "AcousticModelSpanish" to create a Spanish Rejecto model instead of an English one.

NSDictionary *languageGeneratorResults = nil;
NSString *lmPath = nil;
NSString *dicPath = nil;

if ([err code] == noErr) {
    languageGeneratorResults = [err userInfo];
    lmPath = [languageGeneratorResults objectForKey:@"LMPath"];
    dicPath = [languageGeneratorResults objectForKey:@"DictionaryPath"];
} else {
    NSLog(@"Error: %@", [err localizedDescription]);
}

[self.pocketsphinxController setRapidEarsToVerbose:FALSE]; // Defaults to FALSE; gives a lot of debug readout if set to TRUE.
[self.pocketsphinxController setRapidEarsAccuracy:10]; // Defaults to 20 (maximum accuracy); can be set as low as 1 to save CPU.
[self.pocketsphinxController setFinalizeHypothesis:TRUE]; // Defaults to TRUE and returns a final hypothesis; turning it off saves a little CPU, but then only partial "live" hypotheses are returned.
[self.pocketsphinxController setFasterPartials:TRUE]; // Gives faster rapid recognition with less accuracy. This is what you want in most cases, since more accurate partial hypotheses arrive with a delay.
[self.pocketsphinxController setFasterFinals:FALSE]; // Gives an accurate final recognition. Set to TRUE for earlier, less accurate final recognitions.
[self.pocketsphinxController startRealtimeListeningWithLanguageModelAtPath:lmPath dictionaryAtPath:dicPath acousticModelAtPath:[AcousticModel pathToModel:@"AcousticModelEnglish"]]; // Starts the rapid recognition loop. Change "AcousticModelEnglish" to "AcousticModelSpanish" to perform Spanish recognition.
[self.openEarsEventsObserver setDelegate:self];
Most of the time the result is fine, but sometimes it produces a mix of the separate string objects. For example, when I pass the words array @[@"ME AND YOU", @"YOU", @"ME"], the output can be "YOU ME ME ME AND". I don't want it to recognize only part of a phrase.
Any ideas, please?
In pocketsphinxDidReceiveHypothesis:recognitionScore:utteranceID: you can check whether the hypothesis is in your words array before showing it:
- (void)pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore utteranceID:(NSString *)utteranceID {
    if ([words containsObject:hypothesis]) {
        // show the hypothesis
    }
}
OpenEars developer here. To detect fixed phrases with OpenEars, use the new dynamic grammar generation method of LanguageModelGenerator to create a rules-based grammar dynamically, rather than a statistical language model: http://www.politepix.com/2014/04/10/openears-1-7-introducing-dynamic-grammar-generation/
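To give a rough idea of the shape of that API, here is a sketch; the method name and the grammar key below are quoted from memory of the OpenEars 1.7 dynamic grammar post, so treat them as assumptions and check the linked documentation:

// Sketch only: the method name and the OneOfTheseWillBeSaidOnce key are assumptions
// taken from the OpenEars 1.7 dynamic grammar documentation; verify before use.
NSDictionary *grammar = @{
    @"OneOfTheseWillBeSaidOnce" : @[@"ME AND YOU", @"YOU", @"ME"]
};

NSError *grammarError = [lmGenerator generateGrammarFromDictionary:grammar
                                                    withFilesNamed:@"MyGrammarFiles"
                                            forAcousticModelAtPath:[AcousticModel pathToModel:@"AcousticModelEnglish"]];

Because a rules-based grammar only accepts the exact phrases it defines, a partial mix like "YOU ME ME ME AND" should no longer be returned as a hypothesis.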
NSMutableString *pinyin = [[NSMutableString alloc] initWithString:@"你好"];
if (CFStringTransform((__bridge CFMutableStringRef)pinyin, NULL, kCFStringTransformMandarinLatin, NO)) {
    NSLog(@"%@", pinyin);
    // nǐ hǎo
}
if (CFStringTransform((__bridge CFMutableStringRef)pinyin, NULL, kCFStringTransformStripDiacritics, NO)) {
    NSLog(@"%@", pinyin);
    // ni hao
}
From the first if statement above, I get the pinyin with tone marks (the phonetic alphabet). From the second if statement, I get the pinyin without tone marks (like "ni hao").
But the problem is that the tone is exactly what I want. For example, 'hǎo' carries the third tone in pinyin, and that third tone is what I need.
I searched with Google for a long time but did not find a relevant method.
Please let me know if there is an open-source library or another method that already does this.
Thanks in advance.
Maybe this can help you: PinYin4Objc.
You can set the format's ToneType like this: [format setToneType:ToneTypeWithToneNumber], and you'll get the result "ni3 hao3".
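For illustration, a minimal sketch of that usage; the class and method names are from memory of the PinYin4Objc README, so treat them as assumptions:

// Sketch only: API names are assumptions based on the PinYin4Objc README.
HanyuPinyinOutputFormat *format = [[HanyuPinyinOutputFormat alloc] init];
[format setToneType:ToneTypeWithToneNumber]; // emit tone numbers, e.g. "hao3"

NSString *pinyin = [PinyinHelper toHanyuPinyinStringWithNSString:@"你好"
                                     withHanyuPinyinOutputFormat:format
                                                     withNSString:@" "]; // separator between syllables
NSLog(@"%@", pinyin); // expected: "ni3 hao3"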
We have code like the following to retrieve the user's language preference:
NSString *language = [[NSLocale preferredLanguages] firstObject];
Before iOS 8.4, language was "zh-Hans", "de", "ru", "ja", etc. But since iOS 9, I notice that three additional characters, "-US", are appended to the language. For example, "zh-Hans" becomes "zh-Hans-US".
I can't find any documentation about this change. I assume I could do something like the following to work around the issue:
NSRange range = [language rangeOfString:@"-US"];
if (range.location != NSNotFound && language.length == range.location + 3) {
    // If the last 3 chars are "-US", remove them
    language = [language substringToIndex:range.location];
}
However, I am not sure whether it is safe to do so. It seems that "-US" is the region where the user is using the app? But that doesn't really make sense, because we are in Canada. Has anybody from another part of the world tried this?
Apple has started adding regions onto the language locales in iOS 9. Per Apple's docs, there is now a fallback mechanism if no region is specified. If you only need to support certain languages, here is how I worked around it, following the suggestion in Apple's docs:
NSArray<NSString *> *availableLanguages = @[@"en", @"es", @"de", @"ru", @"zh-Hans", @"ja", @"pt"];
self.currentLanguage = [[[NSBundle preferredLocalizationsFromArray:availableLanguages] firstObject] mutableCopy];
This will automatically assign one of the languages in the array, based on the user's language settings, without having to worry about regions.
Source: Technical Note TN2418
To extract the region, I think this is a better solution:
// Format is language-region
NSString *fullString = [[NSLocale preferredLanguages] firstObject];
NSMutableArray *langAndRegion = [NSMutableArray arrayWithArray:[fullString componentsSeparatedByString:@"-"]];
// Region is the last item
NSString *region = [langAndRegion objectAtIndex:langAndRegion.count - 1];
// Remove the region
[langAndRegion removeLastObject];
// Re-join the remaining components to get the language
NSString *lang = [langAndRegion componentsJoinedByString:@"-"];
Swift 5: Remove region from preferred language
Using Locale.preferredLanguages.first gives you the preferred app language (which can be different from the device language for the user).
In order to keep the script code and language code (but remove the region code), I think it is best to create a Locale from the preferred language and grab the information we need from there.
if let pref = Locale.preferredLanguages.first {
let locale = Locale(identifier: pref)
let code = [locale.languageCode, locale.scriptCode].compactMap{$0}.joined(separator: "-")
print(code)
}
So, first we get the preferred app language, then create a Locale from that language.
To get the language code, we build an array with locale.languageCode and locale.scriptCode (which may be nil), remove any nil values with compactMap, and then join the values with a "-".
This should allow support for both Simplified and Traditional Chinese, and lets Apple handle the region instead of assuming it will always be there.
The text that caused the crash is the following:
The error occurred at the following line:
let size = CGSize(width: 250, height: DBL_MAX)
let font = UIFont.systemFontOfSize(16.0)
let attributes = [
    NSFontAttributeName: font,
    NSParagraphStyleAttributeName: paraStyle
]
var rect = text.boundingRectWithSize(size, options:.UsesLineFragmentOrigin, attributes: attributes, context: nil)
where the text variable contains the input string.
paraStyle is declared as follows:
let paraStyle = NSMutableParagraphStyle()
paraStyle.lineBreakMode = NSLineBreakMode.ByWordWrapping
My initial idea is that the system font can't handle these characters and I need to use an NSCharacterSet, but I'm not sure how to either ban the characters that crash my app or, ideally, handle this input. I don't want to ban emojis/emoticons either.
Thanks!
Not an answer, but some information that possibly provides a way to avoid it in code.
Updated with information from The Register:
The problem isn’t with the Arabic characters themselves, but in how the unicode representing them is processed by CoreText, which is a library of software routines to help apps display text on screens.
The bug causes CoreText to access memory that is invalid, which forces the operating system to kill off the currently running program: which could be your text message app, your terminal, or in the case of the notification screen, a core part of the OS.
From Reddit, but this may not be completely correct:
It only works when the message has to be abbreviated with ‘…’. This is usually on the lock screen and main menu of Messages.app.
The words effective and power can be anything as long as they’re on two different lines, which forces the Arabic text farther down the message where some of the letters will be replaced with ‘…’
The crash happens when the first dot replaces part of one of the Arabic characters (they require more than one byte to store). Normally there are safety checks to make sure half characters aren't stored, but this replacement bypasses those checks for whatever reason.
My solution is the following category:

static NSString *const CRASH_STRING = @"\u0963h \u0963 \u0963";

@implementation NSString (CONEffectivePower)

- (BOOL)isDangerousStringForCurrentOS
{
    // IS_IOS_7_OR_LESS / IS_IOS_8_4_OR_HIGHER are our own OS-version-check macros;
    // per this check, only iOS 8.0 through 8.3 is treated as affected.
    if (IS_IOS_7_OR_LESS || IS_IOS_8_4_OR_HIGHER) {
        return NO;
    }
    return [self containsEffectivePowerText];
}

- (BOOL)containsEffectivePowerText
{
    return [self containsString:CRASH_STRING];
}

@end
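For what it's worth, a hypothetical call site (the method and variable names below are made up for illustration) could use the category like this:

// Hypothetical call site: skip rendering messages that would trigger the CoreText bug.
- (void)displayIncomingMessage:(NSString *)message
{
    if ([message isDangerousStringForCurrentOS]) {
        return; // drop the message, or substitute a placeholder
    }
    // ...render the message as usual...
}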
Filter all characters to have the same directionality. Unfortunately, I'm only aware of such an API in Java.
Don't even try. This is a bug in the operating system that will be fixed. It's not your problem. If you try to fix it, you are just wasting your time. And you are very likely to introduce bugs - when you say you "sanitise" input that means you cannot handle some perfectly fine input.
The company I work at develops a multi-platform group video chat.
In Crashlytics reports we started noticing that some users are "effectively" trolling iOS users with this famous Unicode sequence.
We can't just sit and wait for Apple to fix this bug.
So I've worked on this problem, and this is the shortest crashing sequence I got:
// unichar representation
unichar crashChars[8] = {1585, 1611, 32, 2403, 32, 2403, 32, 2403};
// string representation
NSString *crashString = @"\u0631\u064b \u0963 \u0963 \u0963";
So I decided to filter out all text messages that contain two U+0963 'ॣ' symbols with one symbol between them (hope you are able to decipher this phrase).
My code, from an NSString+Extensions category:
static const unichar kDangerousSymbol = 2403;
- (BOOL)isDangerousUnicode {
NSUInteger distance = 0;
NSUInteger charactersFound = 0;
for (NSUInteger i = 0; i < self.length; i++) {
unichar character = [self characterAtIndex:i];
if (charactersFound) {
distance++;
}
if (distance > 2) {
charactersFound = 0;
}
if (kDangerousSymbol == character) {
charactersFound++;
}
if (charactersFound > 1 && distance > 0) {
return YES;
}
}
return NO;
}
Lousy Specta test:
SpecBegin(NSStringExtensions)

describe(@"NSString+Extensions", ^{
    //....
    it(@"should detect dangerous Unicode sequences", ^{
        expect([@"\u0963 \u0963" isDangerousUnicode]).to.beTruthy();
        expect([@"\u0631\u064b \u0963 \u0963 \u0963" isDangerousUnicode]).to.beTruthy();
        expect([@"\u0631\u064b \u0963 \u0963 \u0963" isDangerousUnicode]).to.beFalsy();
    });
    //....
});

SpecEnd
I'm not sure if it's OK to "discriminate" against messages with too many "Devanagari vowel sign vocalic ll" characters.
I'm open to corrections, suggestions, criticism :).
I would love to see a better solution to this problem.
I'm new to iOS / Objective-C and am trying to figure out the best way to create a collection, iterate over it, and time events.
I have a series of lines of a song, and I want each individual line to appear on screen at the right point while the music is playing. So I've started by doing the following: I put the individual lines into a dictionary along with the millisecond value of when each line should appear.
NSDictionary *bualadhBos = [NSDictionary dictionaryWithObjectsAndKeys:
    [NSNumber numberWithInt:2875], @"Muid uilig ag bualadh bos, ",
    [NSNumber numberWithInt:3407], @"Muid uilig ag tógáil cos, ",
    [NSNumber numberWithInt:3889], @"Muid ag déanamh fead ghlaice, ",
    [NSNumber numberWithInt:4401], @"Muid uilig ag geaibíneacht. ",
    [NSNumber numberWithInt:4900], @"Buail do ghlúine 1, 2, 3, ",
    [NSNumber numberWithInt:5383], @"Buail do bholg mór buí, ",
    [NSNumber numberWithInt:5910], @"Léim suas, ansin suigh síos, ",
    [NSNumber numberWithInt:6435], @"Seasaigh suas go hard arís. ",
    [NSNumber numberWithInt:6942], @"Sín amach do dhá lamh, ",
    [NSNumber numberWithInt:7430], @"Anois lig ort go bhfuil tú ' snámh. ",
    [NSNumber numberWithInt:7934], @"Amharc ar dheis, ansin ar chlé, ",
    [NSNumber numberWithInt:8436], @"Tóg do shúile go dtí an spéir. ",
    [NSNumber numberWithInt:8940], @"Tiontaigh thart is thart arís, ",
    [NSNumber numberWithInt:9436], @"Cuir síos do dhá lámh le do thaobh, ",
    [NSNumber numberWithInt:9942], @"Lámha suas is lúb do ghlúin, ",
    [NSNumber numberWithInt:10456], @"Suigí síos anois go ciúin. ", nil];
Then I wanted to iterate over the dictionary and create a timer that calls a method responsible for changing the text in the text layer:
for (id key in bualadhBos) {
    // The stored values are milliseconds, so convert to seconds for NSTimer.
    [NSTimer scheduledTimerWithTimeInterval:[bualadhBos[key] doubleValue] / 1000.0
                                     target:self
                                   selector:@selector(changeText)
                                   userInfo:nil
                                    repeats:NO];
}

- (void)changeText {
    // change the text of the textLayer
    textLayer.string = @"Some New Text";
}
But as I started to debug this and inspect how it might work, I noticed in the debugger that the order in which the items appear in the dictionary has been shuffled around. I'm also concerned (I don't know enough about this) that I'm creating multiple timers and that there might be a more efficient approach to solving this problem.
Any direction would be greatly appreciated.
Dictionaries are, by definition, ordered in whatever way is most suitable for the hash algorithm, so you should never rely on their order.
In your case it would be better to build a binary tree and have a single NSTimer that fires once a second, performs a binary tree search, and returns the closest string for the provided time offset.
If you use AVFoundation or AVPlayer for playback, then to synchronize the subtitles with media playback you could use something like addPeriodicTimeObserverForInterval: to fire once a second, perform the search in your binary tree, and update the UI.
In pseudocode:
[player addPeriodicTimeObserverForInterval:CMTimeMake(1, 1) queue:NULL usingBlock:^(CMTime time) {
// get playback time
NSTimeInterval seconds = CMTimeGetSeconds(time);
// search b-tree
NSString* subtitle = MyBtreeFindSubtitleForTimeInterval(seconds);
// update UI
myTextLabel.text = subtitle;
}];
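MyBtreeFindSubtitleForTimeInterval above is only pseudocode; as one hypothetical way to back it, here is a sketch that uses a sorted array and a simple scan in place of the binary tree (the timedLines dictionary and the helper name are made up for illustration):

// Sketch only: a sorted-array lookup standing in for the binary tree suggested above.
// `timedLines` is assumed to map NSNumber (seconds) -> NSString (lyric line); note the
// question's dictionary is inverted (line -> milliseconds), so build it accordingly.
static NSString *SubtitleForTime(NSDictionary *timedLines, NSTimeInterval seconds) {
    NSArray *times = [[timedLines allKeys] sortedArrayUsingSelector:@selector(compare:)];
    NSString *current = nil;
    for (NSNumber *time in times) {             // linear scan; fine for ~20 lines
        if (time.doubleValue > seconds) break;  // past the playback position
        current = timedLines[time];             // latest line at or before `seconds`
    }
    return current;
}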
I noticed in the debugger that the order in which the items appear in the dictionary has been shuffled around.
Dictionaries are not ordered collections. Don't rely on the order of the elements being the same as the order in which you added them. Don't rely on the order of the elements at all, in any respect.
If you want to access the elements of a dictionary in a certain order, create an array containing the keys in the order you prefer. Then iterate over the array and use each key to access the corresponding value in the dictionary. In your case, you could get the array of keys, sort it into ascending order, and use that. Or you could create an array of dictionaries with keys like "time" and "lyric", one for each line.
All that said, given your current code it's hard to see why you need to access the elements in a particular order. If you're creating all the timers at once, as you're currently doing, things should work fine until the number of timers becomes a problem. I'm not sure where that point is, but I'm sure it's much greater than 20.
I'm creating multiple timers and that there might be a more efficient approach to solving this problem
Sure. You're creating one timer for each line, all at once, so that many timers run concurrently. You could instead create just the first timer and have its action create the next timer when it fires, and so on for each line. To avoid timer drift, use the system time to calculate the appropriate delay until the time when the next line should be displayed, as sketched below.
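A hypothetical sketch of that chained-timer idea (the lines, songStartDate, and textLayer properties are assumptions; lines is assumed to be an array of @{ @"time" : ..., @"lyric" : ... } dictionaries sorted by time, in seconds):

// Sketch only: one timer at a time, scheduling the next line when the current one fires.
// `self.lines`, `self.songStartDate`, and `self.textLayer` are assumed properties.
- (void)scheduleLineAtIndex:(NSUInteger)index {
    if (index >= self.lines.count) return;

    NSDictionary *line = self.lines[index];
    NSTimeInterval target = [line[@"time"] doubleValue];
    // Use the system clock to compute the remaining delay so drift doesn't accumulate.
    NSTimeInterval elapsed = [[NSDate date] timeIntervalSinceDate:self.songStartDate];
    NSTimeInterval delay = MAX(target - elapsed, 0);

    [NSTimer scheduledTimerWithTimeInterval:delay
                                     target:self
                                   selector:@selector(showLineFromTimer:)
                                   userInfo:@(index)
                                    repeats:NO];
}

- (void)showLineFromTimer:(NSTimer *)timer {
    NSUInteger index = [timer.userInfo unsignedIntegerValue];
    self.textLayer.string = self.lines[index][@"lyric"];
    [self scheduleLineAtIndex:index + 1]; // chain the next line
}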
Given that PDFKit is not available on iOS, how is it possible to get the outline of a PDF document in that environment? Are commercial libraries like FastPdfKit or PSPDFKit the only solution?
It's not TOO tricky to access the PDF outline. My outline parser has about 420 LOC. I'll post some snippets so you get the idea. I can't post the full code, as it's part of a commercial library.
You basically start like this:
CGPDFDictionaryRef outlineRef;
if (CGPDFDictionaryGetDictionary(pdfDocDictionary, "Outlines", &outlineRef)) {
going down to
NSArray *outlineElements = nil;
CGPDFDictionaryRef firstEntry;
if (CGPDFDictionaryGetDictionary(outlineRef, "First", &firstEntry)) {
    NSMutableArray *pageCache = [NSMutableArray arrayWithCapacity:CGPDFDocumentGetNumberOfPages(documentRef)];
    outlineElements = [self parseOutlineElements:firstEntry level:0 error:&error documentRef:documentRef cache:pageCache];
} else {
    PSPDFLogWarning(@"Error while parsing outline. First entry not found!");
}
you parse single items like this:
// parse title
NSString *outlineTitle = stringFromCGPDFDictionary(outlineElementRef, @"Title");
PSPDFLogVerbose(@"outline title: %@", outlineTitle);
if (!outlineTitle) {
    if (error_) {
        *error_ = [NSError errorWithDomain:kPSPDFOutlineParserErrorDomain code:1 userInfo:nil];
    }
    return nil;
}

NSString *namedDestination = nil;
CGPDFObjectRef destinationRef;
if (CGPDFDictionaryGetObject(outlineElementRef, "Dest", &destinationRef)) {
    CGPDFObjectType destinationType = CGPDFObjectGetType(destinationRef);
The most annoying thing is that most PDF documents use Named Destinations, which need additional steps to resolve. I save those in an array and resolve them later.
It took quite a while to get it right, as there are LOTS of differences between the PDFs out there; even if you implement everything in compliance with the PDF reference, some files won't work until you apply further tweaking. (PDF is a mess!)
It is now possible in iOS 11+.
https://developer.apple.com/documentation/pdfkit
You can get the PDFOutline of a PDFDocument. The document's outlineRoot property returns the root outline item if the PDF has an outline, and nil if it has none.
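For example, a minimal sketch (Objective-C, iOS 11+) that walks the top level of the outline; documentURL is assumed:

// Minimal sketch (iOS 11+): read the outline root and log the top-level entries.
#import <PDFKit/PDFKit.h>

PDFDocument *document = [[PDFDocument alloc] initWithURL:documentURL]; // documentURL is assumed
PDFOutline *root = document.outlineRoot;                               // nil if the PDF has no outline
for (NSUInteger i = 0; i < root.numberOfChildren; i++) {
    PDFOutline *child = [root childAtIndex:i];
    NSLog(@"%@ -> page %@", child.label, child.destination.page.label);
}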