NSRange in strings with diacritics - iOS

I am working on an app that takes input in Tamil. To find the range of a particular character in the string, I used the code below.
var range = originalWord.rangeOfString("\(character)")
println("\(range.location)")
This works fine except in some cases.
Some characters carry a diacritic, for example í and ó.
Other languages have many such combinations, such as vowel diacritics.
Suppose I have the word "alv`in"
// which is "alvin", but with a diacritic on the "v".
If I print the Unicode values of these characters in Xcode, I get one value per character. For "v`", however, there are two Unicode values, even though it is treated as a single character.
When I look up this character with the code above, I get the following result, which causes errors in my program:
range.location // 2147483647, not a single digit. Why?
For other characters it prints the correct Int value, a single digit such as 3.
Does anybody have an idea how to get this working? How can I find the range when the string contains characters with diacritics?
The code is given below:
// userInput = "இல்லம்"
var originalWord : NSString = ("இல்லம்")
var originalArray = Array("இல்லம்")
var userInputWord = Array(String(userInput))
// -------------------------------------------
for character in String(userInput)
{
switch character
{
case originalArray[0] :
// here matches first character of the userinput to the original word first character
// the character exists at the 0th index
var range = originalWord.rangeOfString("\(character)")
if range.location == 0
{
// same character in the same index
// correctValue increase by one (cow Value)
cowValue += 1
}
else
{
// same character but in the different index
// Wrong value increase by one (bull Value)
bullValue += 1
}
case originalArray[1] :
// here the user-input character matches the second character of the original word
// the character exists at index 1
var range = originalWord.rangeOfString("\(character)")
println("\(range.location)") // here I get the long Int value instead of a single digit
if range.location == 1
{
// same character in the same index
// correctValue increase by one (cow Value)
cowValue += 1
}
else
{
// same character but in the different index
// Wrong value increase by one (bull Value)
bullValue += 1
}

You should use Swift strings instead of NSString, because Swift strings have
full Unicode support including composed character sequences, (extended) grapheme clusters etc.
For Swift strings, rangeOfString() returns an optional Range<String.Index>
which is a bit more complicated to handle. You can also use find() instead to
find the position of a character. This might help as a starting point:
var cowValue = 0
var bullValue = 0
let userInput = "இல்லம்"
let originalWord = "இல்லம்"
let originalArray = Array("இல்லம்")
for character in userInput {
switch character {
case originalArray[0] :
if let pos = find(originalWord, character) {
// Character found in string
println(pos)
if pos == originalWord.startIndex {
// At position 0
cowValue += 1
} else {
// At a different position
bullValue += 1
}
} else {
// Character not found in string
}
case originalArray[1] :
if let pos = find(originalWord, character) {
// Character found in string
println(pos)
if pos == advance(originalWord.startIndex, 1) {
// At position 1
cowValue += 1
} else {
// At a different position
bullValue += 1
}
} else {
// Character not found in string
}
default:
println("What ?")
}
}
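The code above uses Swift 1 names. In current Swift, find() and advance() have been replaced by firstIndex(of:) and index(_:offsetBy:), and println() by print(). Below is a minimal sketch of the same lookup. The helper characterOffset(of:in:) is a hypothetical name, not part of the original answer, and the test word (a Latin "v" with a combining grave accent) is an assumption chosen for illustration:

```swift
// Hypothetical helper (not from the original answer): returns the position of
// `character` in `word`, counted in Characters (grapheme clusters), or nil.
func characterOffset(of character: Character, in word: String) -> Int? {
    guard let index = word.firstIndex(of: character) else { return nil }
    return word.distance(from: word.startIndex, to: index)
}

// "v" followed by U+0300 COMBINING GRAVE ACCENT is two Unicode scalars
// but a single Character.
let word = "alv\u{0300}in"
let accentedV: Character = "v\u{0300}"

print(word.unicodeScalars.count)                        // 6 scalars
print(word.count)                                       // 5 Characters
print(characterOffset(of: accentedV, in: word) as Any)  // Optional(2)
print(characterOffset(of: "z", in: word) as Any)        // nil
```

A missing character yields nil rather than a sentinel value, so the caller has to handle the not-found case explicitly.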

Check out the documentation for NSString's rangeOfComposedCharacterSequenceAtIndex: and rangeOfComposedCharacterSequencesForRange:
You want to look for Composed Character Sequences, not individual characters.
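For code that has to stay with NSString, here is a sketch of the composed-character call mentioned above, in its current Swift spelling rangeOfComposedCharacterSequence(at:). The test word is an assumption chosen for illustration. Note that the value 2147483647 from the question equals NSNotFound on a 32-bit platform, which is what range(of:) reports when the search finds nothing:

```swift
import Foundation

// Illustrative word: "v" plus U+0300 COMBINING GRAVE ACCENT.
let word = NSString(string: "alv\u{0300}in")

// NSString counts UTF-16 code units, so the accented "v" occupies two slots.
print(word.length)                          // 6

// Ask for the whole composed sequence that contains index 2.
let composed = word.rangeOfComposedCharacterSequence(at: 2)
print(composed.location, composed.length)   // 2 2

// A failed search is reported as NSNotFound rather than as an optional.
let missing = word.range(of: "z")
print(missing.location == NSNotFound)       // true
```

Checking the location against NSNotFound before comparing it with an index avoids treating a failed search as a large position.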

Related

How Can I Construct an Efficient CoreData Search, Including Allowing For Preceding and Trailing Characters Here?

Based on straight SQL searches in a previous app, I am adding CoreData searching to a new app. These searches are in a custom dictionary db that the app contains; this function does the work:
public func wordMatcher (pad: Int, word: Array<String>, substitutes : Set<String> ) {
let context = CoreDataManager.shared.persistentContainer.viewContext
var query: Array<String>
var foundPositions : Set<Int> = []
var searchTerms : Array<String> = []
if word.count >= 4 {
for i in 0..<word.count {
for letter in substitutes {
query = word
query[i] = letter
searchTerms.append(query.joined())
let rq: NSFetchRequest<Word> = Word.fetchRequest()
rq.predicate = NSPredicate(format: "name LIKE %@", query.joined())
rq.fetchLimit = 1
do {
if try context.fetch(rq).count != 0 {
foundPositions.insert(i)
break
}
} catch {
}
}
// do aggregated searchTerms search here instead of individual searches?
}
}
}
The NSFetchRequest focuses on one permutation at a time. But I'm accumulating the search string fragments in the array searchTerms because I don't know if it would be more efficient to construct a single query connected with ORs, and I also don't know how to do that in CoreData.
The focus is on the positions in the original term word: I need to indicate if any given location has at least one of the substitutes as a valid fit. So to implement the aggregate searchTerms approach, a FetchRequest would have to happen for each location in the base term.
A second complication is the one referred to in the title of the question. I am using LIKE because the search term in the FetchRequest could be a substring in a longer word. However, the maximum number of letters is 11, and pad is the starting point of the original term in that field of 11 spaces.
So if pad is 3, then I would need to allow for 0..<pad preceding characters. And because there may be trailing characters, I would also want results with 0..<(11 - (pad + word.count)) alphabetic characters after the last letter in the search term.
Regex seems like one way to do this, but I haven't found a clear example of how to do this in this case, and especially with the multiple search terms (if that's the way to go). The limits of SQLite in the previous version forced constructing multiple queries with increasing numbers of "_" underscores to indicate the padding characters; that tended to really explode the number of queries.
BTW, substitutes is limited to an absolute maximum of 9 values, and in practice is usually below 5, so things are a little more manageable.
I would like to get a grip on this, and so if anyone can provide direction or examples that can make this a reasonably efficient function, the help is appreciated greatly.
EDIT:
I've realized that I need a result for each position in the target string, including cases where the leading and trailing spaces may also need to contain a substitute.
So I'm moving to this:
public func wordMatcher (pad: Int, word: Array<String>, substitutes : Set<String> ) {
let context = CoreDataManager.shared.persistentContainer.viewContext
var pad_ = pad
var query: Array<String>
var foundPositions : Set<Int> = []
let rq: NSFetchRequest<Word> = Word.fetchRequest()
rq.fetchLimit = 1
let subs = "[\(substitutes.joined())]"
// if word.count >= 4 { // because those locations will be blocked off anyway otherwise
let start = pad > 0 ? -1 : 0
let finish = 11 - (pad + word.count) > 0 ? word.count + 1 : word.count
for i in start..<finish {
query = word
var _pad = 11 - (pad + word.count)
if i == -1 {
query = Array(arrayLiteral: subs) + query
pad_ -= 1
} else if i >= word.count {
query.append(subs)
_pad -= 1
} else {
pad_ = pad
query[i] = subs
}
let endPad = _pad > 0 ? "{0,\(_pad)}" : ""
let predMatch = ".\(query.joined())\(endPad)"
print(predMatch)
rq.predicate = NSPredicate(format:"position <= %d AND word MATCHES %@", pad_, predMatch)
do {
if try context.fetch(rq).count != 0 {
foundPositions.insert(i)
}
} catch {
}
// }
}
lFreq = foundPositions
}
This relies on a regex substitution, inserted into the original target string. What I'll have to find out is if this is fast enough at the edge cases, but it may not be critical even in the worst case.
predMatch will end up looking something like "ab[xyx]d{0,3}", and I think I can get rid of the position section by changing it to be "{0,2}ab[xyx]d{0,3}". But I guess I'm going to have to try to find out.
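One note on the pattern syntax: in a regular expression, a quantifier such as {0,3} applies to the atom before it, so "d{0,3}" means up to three "d" characters, and padding of arbitrary characters is usually written ".{0,3}". Below is a minimal sketch of building such a pattern. The helper paddedPattern is a hypothetical name, and the sketch assumes the substitutes contain no regex metacharacters:

```swift
// Hypothetical helper: replaces the letter at `index` with a character class
// of the substitutes and adds optional wildcard padding on both sides.
func paddedPattern(word: [String], substitutes: Set<String>, index: Int,
                   leadingPad: Int, trailingPad: Int) -> String {
    var query = word
    // sorted() only makes the output reproducible; a Set has no stable order.
    query[index] = "[\(substitutes.sorted().joined())]"
    // ".{0,n}" allows between 0 and n arbitrary characters.
    let lead = leadingPad > 0 ? ".{0,\(leadingPad)}" : ""
    let trail = trailingPad > 0 ? ".{0,\(trailingPad)}" : ""
    return lead + query.joined() + trail
}

let pattern = paddedPattern(word: ["a", "b", "c", "d"],
                            substitutes: ["x", "y", "z"],
                            index: 2, leadingPad: 2, trailingPad: 3)
print(pattern)  // .{0,2}ab[xyz]d.{0,3}
```

The resulting string would be passed as the argument of MATCHES. Whether a single padded pattern is fast enough compared with the separate position check is something to measure.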

How to Check if String begins with Alphabet Letter in Swift 5?

Problem: I am currently trying to sort a List in SwiftUI by each item's first character. I would also like to add a section for all items that don't begin with a letter of the alphabet (numbers, special characters).
My code so far:
let nonAlphabetItems = items.filter { $0.name.uppercased() != /* begins with A - Z */ }
Does anyone have a solution for this? Of course I could write a large loop construct, but I hope there is a more elegant way.
Thanks for your help.
You can check if a string range "A"..."Z" contains the first letter of your name property:
struct Item {
let name: String
}
let items: [Item] = [.init(name: "Def"),.init(name: "Ghi"),.init(name: "123"),.init(name: "Abc")]
let nonAlphabetItems = items.filter { !("A"..."Z" ~= ($0.name.first?.uppercased() ?? "#")) }
nonAlphabetItems // [{name "123"}]
Expanding on this topic, we can extend Character to add an isAsciiLetter property:
extension Character {
var isAsciiLetter: Bool { "A"..."Z" ~= self || "a"..."z" ~= self }
}
This allows us to extend StringProtocol to check if a string starts with an ASCII letter:
extension StringProtocol {
var startsWithAsciiLetter: Bool { first?.isAsciiLetter == true }
}
And just a helper to negate a boolean property:
extension Bool {
var negated: Bool { !self }
}
Now we can filter the items collection as follows:
let nonAlphabetItems = items.filter(\.name.startsWithAsciiLetter.negated) // [{name "123"}]
If you need an occasional filter, you could simply write a condition combining standard predicates isLetter and isASCII which are already defined for Character. It's as simple as:
let items = [ "Abc", "01bc", "Ça va", "", " ", "𓀫𓀫𓀫𓀫"]
let nonAlphabetItems = items.filter { $0.isEmpty || !$0.first!.isASCII || !$0.first!.isLetter }
print (nonAlphabetItems) // -> Output: ["01bc", "Ça va", "", " ", "𓀫𓀫𓀫𓀫"]
If the string is not empty, it is sure to have a first character, $0.first!. It is tempting to use isLetter alone, but it is true for characters in many alphabets, including ancient Egyptian hieroglyphs like "𓀫" and French letters such as "Ç" and other accented characters. This is why you need to restrict it to ASCII letters, which limits the match to the unaccented Roman alphabet.
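You can use CharacterSet (NSCharacterSet) in the following way:
import Foundation
let phrase = "Test case"
// Assumption: characterSet was not defined in the original snippet; the set of letters is used here
let characterSet = CharacterSet.letters
let range = phrase.rangeOfCharacter(from: characterSet)
// range will be nil if no letter is found
if range != nil {
    print("letters found")
} else {
    print("letters not found")
}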
You can also compare the ASCII value of the first character:
extension String {
    var firstCharacterIsAlphabet: Bool {
        guard let firstChar = self.first else { return false }
        let unicode = String(firstChar).unicodeScalars
        let ascii = Int(unicode[unicode.startIndex].value)
        // 65...90 is "A"..."Z", 97...122 is "a"..."z"
        return (ascii >= 65 && ascii <= 90) || (ascii >= 97 && ascii <= 122)
    }
}
var isAlphabet = "Hello".firstCharacterIsAlphabet
The Character type has a property for this:
let x: Character = "x"
x.isLetter // true for letters, false for punctuation, numbers, whitespace, ...
Note that this will include characters from other alphabets (Greek, Cyrillic, Chinese, ...).
As String is a Sequence with Element equal to Character, we can use the .first property to get the first char.
With this, you can filter your items:
let filtered = items.filter { $0.name.first?.isLetter ?? false }
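For the sectioned list in the question, the same checks can produce a section key per item. Below is a minimal sketch. The helper sectionKey(for:) is a hypothetical name, "#" is an assumed key for the non-alphabet section, and the Item type mirrors the one used in the first answer:

```swift
struct Item {
    let name: String
}

// Hypothetical helper: "A"..."Z" for names starting with an ASCII letter,
// "#" for everything else (digits, symbols, non-ASCII letters, empty names).
func sectionKey(for name: String) -> String {
    guard let first = name.first, first.isASCII, first.isLetter else { return "#" }
    return first.uppercased()
}

let items = [Item(name: "Def"), Item(name: "abc"), Item(name: "123"), Item(name: "Ça va")]

// Group the items by key; items keep their original order within a group.
let sections = Dictionary(grouping: items) { sectionKey(for: $0.name) }
let orderedKeys = sections.keys.sorted()

for key in orderedKeys {
    print(key, sections[key]!.map { $0.name })
}
```

Dropping the isASCII check would let letters such as "Ç" form their own sections instead of landing under "#".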
You can get this done through this simple String extension
extension StringProtocol {
var isFirstCharacterAlp: Bool {
first?.isASCII == true && first?.isLetter == true
}
}
Usage:
print ("H1".isFirstCharacterAlp)
print ("ابراهيم1".isFirstCharacterAlp)
Output
true
false
Happy Coding!

How to convert sequence of ASCII code into string in swift 4?

I have a sequence of ASCII codes in string format, like 7297112112121326610511411610410097121. How can I convert it into text?
I tried the code below:
func convertAscii(asciiStr: String) {
var asciiString = ""
for asciiChar in asciiStr {
if let number = UInt8(asciiChar, radix: 2) { // Cannot invoke initializer for type 'UInt8' with an argument list of type '(Character, radix: Int)'
print(number)
let character = String(describing: UnicodeScalar(number))
asciiString.append(character)
}
}
}
convertAscii(asciiStr: "7297112112121326610511411610410097121")
But I get an error on the if let number line.
As already mentioned, decimal ASCII values are in the range 0-255 and can take more than 2 digits.
Based on Sulthan's answer, and assuming the text contains no characters below 32 (0x20) or above 199 (0xc7), this approach checks the first character of the remaining string. If it is "1", the character is represented by 3 digits, otherwise by 2.
func convertAscii(asciiStr: String) {
var source = asciiStr
var result = ""
while source.count >= 2 {
let digitsPerCharacter = source.hasPrefix("1") ? 3 : 2
let charBytes = source.prefix(digitsPerCharacter)
source = String(source.dropFirst(digitsPerCharacter))
let number = Int(charBytes)!
let character = UnicodeScalar(number)!
result += String(character)
}
print(result) // "Happy Birthday"
}
convertAscii(asciiStr: "7297112112121326610511411610410097121")
If we consider the string to be composed of characters where every character is represented by 2 decimal digits, then something like this would work (this is just an example, not optimal).
func convertAscii(asciiStr: String) {
var source = asciiStr
var characters: [String] = []
let digitsPerCharacter = 2
while source.count >= digitsPerCharacter {
let charBytes = source.prefix(digitsPerCharacter)
source = String(source.dropFirst(digitsPerCharacter))
let number = Int(charBytes, radix: 10)!
let character = UnicodeScalar(number)!
characters.append(String(character))
}
let result: String = characters.joined()
print(result)
}
convertAscii(asciiStr: "7297112112121326610511411610410097121")
However, the format itself is ambiguous because ASCII characters can take from 1 to 3 decimal digits. To parse it correctly, you need all characters to have the same length (e.g. 1 should be 001).
Note that I always take the same number of digits, convert them to a number, and then create a character from that number.
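The fixed-width idea can be sketched as follows, assuming every code is zero-padded to exactly 3 digits and the text is plain ASCII. encodeFixedWidth and decodeFixedWidth are hypothetical names, not part of the answers above:

```swift
// Encodes ASCII text as a digit string with exactly 3 digits per character.
func encodeFixedWidth(_ text: String) -> String {
    return text.unicodeScalars.map { scalar -> String in
        precondition(scalar.isASCII, "this sketch assumes ASCII input")
        let digits = String(scalar.value)
        // Left-pad with zeros so that every code is 3 digits wide.
        return String(repeating: "0", count: 3 - digits.count) + digits
    }.joined()
}

// Decodes a string produced by encodeFixedWidth. Returns nil if the length
// is not a multiple of 3 or a chunk is not a valid ASCII code.
func decodeFixedWidth(_ digits: String) -> String? {
    var result = ""
    var rest = Substring(digits)
    while !rest.isEmpty {
        let chunk = rest.prefix(3)
        guard chunk.count == 3, let value = UInt8(chunk), value < 128 else { return nil }
        result.unicodeScalars.append(Unicode.Scalar(value))
        rest = rest.dropFirst(3)
    }
    return result
}

let encoded = encodeFixedWidth("Happy Birthday")
print(encoded)
print(decodeFixedWidth(encoded) ?? "invalid")  // Happy Birthday
```

With a fixed width there is no need to guess from the leading digit, so codes below 32 and the whole 100-127 range decode without special cases.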

How can I insert a space character before every upper case letter except the first one in each element of a string array in DXL script?

I would like to edit the elements of a string array with a DXL script, inside a for loop. The problem is as follows:
I would like to insert a space in front of every upper case letter except the first one, and apply this to all lines in the string array.
Example:
There is a string array:
AbcDefGhi
GhiDefAbc
DefGhiAbc
etc.
and finally I would like to see the result as:
Abc Def Ghi
Ghi Def Abc
Def Ghi Abc
etc.
Thanks in advance!
Derived straight from the DXL manual:
Regexp upperChar = regexp2 "[A-Z]"
string s = "yoHelloUrban"
string sNew = ""
while (upperChar s) {
sNew = sNew s[ 0 : (start 0) - 1] " " s [match 0]
s = s[end 0 + 1:]
}
sNew = sNew s
print sNew
You might have to tweak this, since you do not want a space inserted before EVERY capital letter, only before those that are not at the beginning of your string.
Here's a solution written as a function that you can just drop into your code. It processes an input string character by character. Always outputs the first character as-is, then inserts a space before any subsequent upper-case character.
For efficiency, if processing a large number of strings, or very large strings (or both!), the function could be modified to append to a buffer instead of a string, before finally returning a string.
string spaceOut(string sInput)
{
const int intA = 65 // DECIMAL 65 = ASCII 'A'
const int intZ = 90 // DECIMAL 90 = ASCII 'Z'
int intStrLength = length(sInput)
int iCharCounter = 0
string sReturn = ""
sReturn = sReturn sInput[0] ""
for (iCharCounter = 1; iCharCounter < intStrLength; iCharCounter++)
{
if ((intOf(sInput[iCharCounter]) >= intA)&&(intOf(sInput[iCharCounter]) <= intZ))
{
sReturn = sReturn " " sInput[iCharCounter] ""
}
else
{
sReturn = sReturn sInput[iCharCounter] ""
}
}
return(sReturn)
}
print(spaceOut("AbcDefGHi"))

How to capitalize each alternate character of a string?

Let's say there is a string "johngoestoschool"; it should become "JoHnGoEsToScHoOl". If there are special characters in between, they should be ignored; for example, given the string "jo$%##hn^goe!st#os&choo)l", the answer should be "Jo$%##Hn^GoE!sT#oS&cHoO)l".
From this answer, we can iterate like this:
let s = "alpha"
for i in s.characters.indices[s.startIndex..<s.endIndex]
{
print(s[i])
}
Why can't we print the value of "i" here?
When we inspect i.customPlaygroundQuickLook, it shows Int 0 to Int 4.
So my idea is to do something like this:
if (i.customPlaygroundQuickLook == 3) {
s.characters.currentindex = capitalized
}
Kindly help
This should solve your problem. The hard part is just checking whether each character is a letter or not. Using inout and replacing ranges in place would give better performance:
func altCaptalized(string: String) -> String {
    // Split the string into an array of single-character strings
    var stringAr = string.map { String($0) }
    var numOfLetters = 0
    // Walk the Characters so that the index i matches stringAr
    for (i, character) in string.enumerated() {
        // Count only letters, so symbols do not shift the alternation
        if character.isLetter {
            if numOfLetters % 2 == 0 {
                stringAr[i] = stringAr[i].uppercased() // Replace the lowercased letter with its uppercased form
            }
            numOfLetters += 1
        }
    }
    return stringAr.joined() // Combine all the single-character strings into one string
}
