Apple's pinyin ranking algorithm - ios

I'm currently developing an English-to-Chinese dictionary app to learn iOS development, and I'm stuck on how to rank the more commonly used Chinese characters first when the user searches in pinyin.
My question is:
Is there some way to use Apple's own ranking algorithm, the one that orders the Chinese characters offered when pinyin is typed (it does a pretty good job of producing the right character)? Or is there some other way I can achieve this?

If you want to convert Chinese characters to pinyin, you can use:
CFStringTransform or PinYin4Objc.
If you want the first letter of the pinyin, you can use pinyinFirstLetter.
If you just want to sort in pinyin alphabetical order, you can use:
sortedArray = [array sortedArrayUsingSelector:@selector(localizedCaseInsensitiveCompare:)];
Note: polyphonic characters and place names may not come out right.
Edit:
What you describe sounds like autocomplete; these questions may help:
How to create an efficient auto-complete?
Implementing Autocomplete in iOS
Hope this helps.
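Apple's ranking is not exposed through any API mentioned in this thread, so a common substitute is to rank candidates by how often each character is used. Below is a minimal sketch of that idea, written in Ruby for brevity (the same logic ports directly to Objective-C). The `CANDIDATES` table, its counts, and the `rank_candidates` helper are all hypothetical; a real app would load the counts from a character frequency list.

```ruby
# Hypothetical usage counts per toneless pinyin syllable.
# The numbers are placeholders for illustration, not measurements.
CANDIDATES = {
  "shi" => { "是" => 900, "时" => 400, "事" => 350, "十" => 300 },
  "ma"  => { "吗" => 500, "妈" => 300, "马" => 250 }
}

# Returns the characters for a pinyin syllable, most frequent first.
# Ties are broken by the character itself so the order is stable.
def rank_candidates(pinyin, table = CANDIDATES)
  entries = table.fetch(pinyin.downcase, {})
  entries.sort_by { |char, count| [-count, char] }.map(&:first)
end

puts rank_candidates("shi").join(" ")
```

An unknown syllable simply yields an empty list, so the caller can fall back to showing no suggestions.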

Translating and localizing technical words into other languages

I'm currently translating a website from English into other languages, but I have a problem when it comes to technical terms (non-words) like "crontab".
Should I keep the English term, or is there another way to find an equivalent?
These aren't actually English words, and when it comes to languages like Japanese, I'm at a loss as to what to do.
Here's an example sentence:
"Use crontab to schedule scripts."
which translated into Japanese via Google Translate becomes:
"スクリプトをスケジュールするcrontabを使用してください。"
You can see how bizarre this looks, and I'm wondering if the sentence could even be understood by a Japanese speaker.
What do I do in these situations?
Using English words in Japanese
As for the word crontab, I don't think it's bizarre to write it in English in a Japanese sentence like this:
crontabを使用してください
(please use crontab)
On Japanese Wikipedia, you can see how crontab is used without being translated into Japanese.
http://ja.wikipedia.org/wiki/Crontab
In Japanese technical writing, especially when you mention the names of tools, it is common to use the English as it is without translating it into Japanese.
Using Katakana
You could also write the sentence like below using Katakana.
クーロンタブを使用してください
(please use crontab).
Japanese usually writes words borrowed from English in Katakana. Katakana is phonetic; in other words, each character represents a sound, not a meaning. But in this case, it doesn't look natural.
Mistranslation
There is a mistranslation in your Japanese sentence.
スクリプトをスケジュールするcrontabを使用してください。
(Please use the crontab that schedules a script.)
To correct this, you could go like this:
スクリプトをスケジュールするには、crontabを使用してください。
(In order to schedule a script, please use crontab.)
Hope this helps.

Trouble in openEars to recognize letter [duplicate]

I used OpenEars in my app, which only needs to recognize the letters "a" to "z".
But its recognition of individual letters is much worse than its recognition of words.
So, how can I use my own sound model to improve OpenEars' recognition?
And how can I use OpenEars to recognize a special sound?
For example, I give OpenEars a dog sound and I want it to give me back "dog".
So this is a two-part question, which might serve the community better if split up. From what I understand, OpenEars works best with words that are in the dictionary. If you want it to recognize letters of the alphabet, I would try using the phonetic spelling of each letter instead of just the letter. So instead of using 'f', use "ef".
As for the second part of the question, you might be able to recognize specific types of dogs that go "ruff", but smaller dogs with more of a "yip!" would have to be added to the initial dictionary as well.
I would get the demo app and really just experiment with these words.
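One way to organize that experiment is to keep a table of phonetic spellings and translate each recognized word back to its letter. The sketch below is in Ruby only to show the shape of the lookup; it does not call OpenEars. Apart from "ef", which comes from the answer above, the spellings and the `letter_for` helper are assumptions to tune against real recognition results.

```ruby
# Phonetic spelling => letter. Only "ef" comes from the answer above;
# the other spellings are guesses to be adjusted by experiment.
LETTER_SPELLINGS = {
  "ay"  => "a",
  "bee" => "b",
  "see" => "c",
  "dee" => "d",
  "ee"  => "e",
  "ef"  => "f"
}

# The keys double as the vocabulary handed to the recognizer.
VOCABULARY = LETTER_SPELLINGS.keys

# Maps a recognized word back to its letter, or nil if it is not a letter.
def letter_for(hypothesis)
  LETTER_SPELLINGS[hypothesis.strip.downcase]
end

puts letter_for("ef")
```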

Determine if a string is English

Is there a library where I can simply call a method on a string to find out if it is non-English? I'm trying to save only English strings, and the incoming stream of strings has plenty of non-English ones in it.
You can try to use linguo.
"your string".lang
# will return "en" for english strings
Disclaimer: I'm the creator of this gem.
You can use GoogleTranslate API with the RailsBridge for it - http://code.google.com/apis/gdata/articles/gdata_on_rails.html
Not that I'm aware of, but you could load this word list into an array (http://www.langmaker.com/wordlist/basiclex.htm) and then match the string's words against it. Decide on some percentage of matches as good enough, and go from there.
You could even use a Bayesian algorithm here to mark those words as "good" and learn from there, but that might be overkill.
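The word-list approach can be sketched as follows. This is a minimal illustration, not a tested language detector: `BASIC_ENGLISH` is a tiny stand-in for the real list, and `english_ratio`, `probably_english?`, and the `THRESHOLD` value are hypothetical names and settings.

```ruby
require "set"

# Tiny stand-in word list; a real one would be loaded from a file.
BASIC_ENGLISH = Set.new(%w[
  the a an is are was to of in on and or not this that it
  i you we they use please script schedule string word
])

# Hypothetical cutoff; tune it against real data.
THRESHOLD = 0.5

# Fraction of the string's words that appear in the word list.
# Text with no Latin-letter words at all scores 0.0.
def english_ratio(text, words = BASIC_ENGLISH)
  tokens = text.downcase.scan(/[a-z']+/)
  return 0.0 if tokens.empty?
  hits = tokens.count { |t| words.include?(t) }
  hits.to_f / tokens.size
end

# True when enough of the words are in the list.
def probably_english?(text, threshold = THRESHOLD)
  english_ratio(text) >= threshold
end

puts probably_english?("Use crontab to schedule the script")
```

A set is used instead of an array so each lookup stays fast as the word list grows. Short strings are the weak spot: with one or two words, a single unknown term swings the ratio a lot.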

What dictionary does UITextChecker use?

Does anybody know what dictionary UITextChecker pulls from? I use it to verify that a word is in fact a valid word in an app. I have some questions from users about why specific words are available in other games (Boggle/Scrabble) but not in mine.
Examples: ai, qi, qat, xu, ae, tae, ait, ain, lav, aa, shh, za
I checked against /usr/share/dict/words and none of these words are in Webster's Second International, so maybe UITextChecker uses this same source? They do show up in other dictionaries online (but this is really beside the point of the post).
Thanks for any insight!
UITextChecker may be using the same dictionary that UIReferenceLibraryViewController uses. In that case, you could use something like [UIReferenceLibraryViewController dictionaryHasDefinitionForTerm:@"term"], and if it returns YES the word exists. I'm not sure how complete the built-in dictionary is, however.
I guess it uses the iPhone dictionary of the user, which depends on the current language/NSLocale the user is using (which is set in the "International" Settings on the iPhone). This is the behavior we observe when typing text anywhere on the iPhone: which words are underlined in red (because they are flagged by the internal UITextChecker) depends on the locale used.
If the user has activated multiple keyboards, each with a different language (e.g. a French AZERTY keyboard and a US QWERTY keyboard), it depends on the current language, namely the keyboard that is active at that moment.
If you are referring to the Wordfeud dictionary (that is the only game I know those words from), they check their words against an online dictionary on their own server. It must be a list parsed from another spelling site or something.
I sometimes doubt the validity of some words though....
