Swift advancedBy can't handle newline character "\r\n" [duplicate] - ios

This question already has answers here:
NSRange to Range<String.Index>
(16 answers)
Closed 7 years ago.
I ran into a very strange problem today with Swift 2.
I have this simple method to extract a substring based on NSRange:
func substringWithRange(string: String, range: NSRange) -> String {
let startIndex = string.startIndex.advancedBy(range.location)
let endIndex = startIndex.advancedBy(range.length)
let substringRange = Range<String.Index>(start: startIndex, end: endIndex)
return string.substringWithRange(substringRange)
}
With ordinary strings or strings containing unicode characters everything works fine. But one string contains the newline characters "\r\n" and suddenly
let startIndex = string.startIndex.advancedBy(range.location)
is always 1 greater than it should be.
let string = "<html>\r\n var info={};</html>"
let range = NSMakeRange(9, 12)
let substring = substringWithRange(string, range: range)
//Expected: var info={};
//Actual: ar info={};<
//string.startIndex = 0
//range.location = 9
//startIndex after advancedBy = 10
Does anyone know why advancedBy is acting that way and how I can solve this problem?

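The reason is that Swift treats "\r\n" as a single Character (one extended grapheme cluster), whereas NSRange counts UTF-16 code units, of which "\r\n" has two. Advancing a String.Index by range.location therefore overshoots by one once the index has passed the CRLF.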
let cr = "\r"
cr.characters.count // 1
let lf = "\n"
lf.characters.count // 1
let crlf = "\r\n"
crlf.characters.count // 1
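One way to avoid the mismatch, sketched here in current Swift rather than the Swift 2 of the question, is to let Foundation translate the NSRange (measured in UTF-16 code units) into a Range<String.Index> instead of advancing a Character index. The helper name substring(of:in:) is made up for this illustration.

```swift
import Foundation

/// Hypothetical helper: extracts the text covered by an NSRange.
/// Returns nil if the NSRange does not fit inside the string.
func substring(of string: String, in nsRange: NSRange) -> String? {
    // Range(_:in:) interprets the NSRange as UTF-16 offsets, which is
    // how NSString, NSRange and NSRegularExpression measure text.
    guard let range = Range(nsRange, in: string) else { return nil }
    return String(string[range])
}

let html = "<html>\r\n var info={};</html>"
let result = substring(of: html, in: NSRange(location: 9, length: 12))
print(result ?? "nil") // var info={};
```

Because the conversion works on the UTF-16 view, the "\r\n" pair counts as two units here, matching what NSRange expects.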

Swift: Getting range of text that includes emojis [duplicate]

This question already has an answer here:
Swift Regex doesn't work
(1 answer)
Closed 5 years ago.
I'm trying to parse out "#mentions" from a user provided string. The regular expression itself seems to find them, but the range it provides is incorrect when emoji are present.
let text = "😂😘🙂 #joe "
let tagExpr = try? NSRegularExpression(pattern: "#\\S+")
tagExpr?.enumerateMatches(in: text, range: NSRange(location: 0, length: text.characters.count)) { tag, flags, pointer in
guard let tag = tag?.range else { return }
if let newRange = Range(tag, in: text) {
let replaced = text.replacingCharacters(in: newRange, with: "[email]")
print(replaced)
}
}
When running this
tag = (location: 7, length: 2)
And prints out
😂😘🙂 [email]oe
The expected result is
😂😘🙂 [email]
NSRegularExpression (and anything involving NSRange) operates on UTF-16 counts and indexes. For that matter, NSString.length is the UTF-16 count as well.
But in your code, you're telling NSRegularExpression to use a length of text.characters.count. This is the number of composed characters, not the UTF-16 count. Your string "😂😘🙂 #joe " has 9 composed characters, but 12 UTF-16 code units. So you're actually telling NSRegularExpression to look only at the first 9 UTF-16 code units, which means it's ignoring the trailing "oe ".
The fix is to pass length: text.utf16.count.
let text = "😂😘🙂 #joe "
let tagExpr = try? NSRegularExpression(pattern: "#\\S+")
tagExpr?.enumerateMatches(in: text, range: NSRange(location: 0, length: text.utf16.count)) { tag, flags, pointer in
guard let tag = tag?.range else { return }
if let newRange = Range(tag, in: text) {
let replaced = text.replacingCharacters(in: newRange, with: "[email]")
print(replaced)
}
}
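The difference between the two counts can be checked directly. The sketch below assumes a current Swift toolchain and builds the full-string NSRange with NSRange(_:in:), so the length never has to be computed by hand; the variable names are illustrative.

```swift
import Foundation

let text = "😂😘🙂 #joe "

// Each of these emoji is one Character but two UTF-16 code units.
let characterCount = text.count
let utf16Count = text.utf16.count

// An NSRange covering the whole string, measured in UTF-16 units.
let fullRange = NSRange(text.startIndex..., in: text)

var tags: [String] = []
if let tagExpr = try? NSRegularExpression(pattern: "#\\S+") {
    for match in tagExpr.matches(in: text, range: fullRange) {
        // Convert each match back to String indices before slicing.
        if let range = Range(match.range, in: text) {
            tags.append(String(text[range]))
        }
    }
}
print(characterCount, utf16Count, tags) // 9 12 ["#joe"]
```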

How to handle the %s format specifier

Objective-C code:
NSString *str = @"hi";
NSString *strDigit = @"1934"; // or @"193"; the value may have 3 or 4 digits
[dayText appendFormat:@"%@%4s", str, [strDigit UTF8String]];
The Objective-C code keeps the output correctly aligned whether the value has 3 or 4 digits; the number of digits does not matter. Does anyone know how to handle this in Swift?
In Swift I tried the code below, but the string does not adjust its alignment to the number of digits.
textForTrip += "\(str) \(String(format:"%4s", (strDigit.utf8))"
The %s format expects a pointer to a (NULL-terminated) C string
as argument, that can be obtained with the withCString method.
This would produce the same output as your Objective-C code:
let str = "Hi"
let strDigit = "193"
let text = strDigit.withCString {
String(format: "%@%4s", str, $0)
}
print(text)
It becomes easier if you store the number as an integer instead of a
string:
let str = "Hi"
let number = 934
let text = String(format: "%@%4d", str, number)
print(text)
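If a format string is not required, the same fixed-width effect can be sketched in plain Swift by prepending spaces. The helper name leftPadded(_:toWidth:) is hypothetical, and it counts Characters, whereas %4s counts bytes; for ASCII digits the two agree.

```swift
/// Hypothetical helper: right-aligns `value` in a field of `width`
/// characters by adding spaces on the left, similar to "%4s".
func leftPadded(_ value: String, toWidth width: Int) -> String {
    // Values already at or beyond the width are returned unchanged.
    let missing = max(0, width - value.count)
    return String(repeating: " ", count: missing) + value
}

let threeDigits = "Hi" + leftPadded("193", toWidth: 4)
let fourDigits = "Hi" + leftPadded("1934", toWidth: 4)
print(threeDigits) // Hi 193
print(fourDigits)  // Hi1934
```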
Try this below approach, that might help you
let strDigit = "\("1934".utf8)" //(or @"193" may be a 3 digit or 4 digit value)
var dayText = "Hello, good morning."
dayText += "\(strDigit.prefix(3))"

Swift - Whitespace count in a string [duplicate]

This question already has answers here:
Find number of spaces in a string in Swift
(3 answers)
Closed 5 years ago.
How do you get the count of the empty space within text?
It would be more helpful to me if explained with an example.
You can use either components(separatedBy:) or filter, like
let array = string.components(separatedBy:" ")
let spaceCount = array.count - 1
or
let spaceCount = string.filter{$0 == " "}.count
If you want to count other whitespace characters too (not only spaces), use a regular expression:
let string = "How to get count of the empty space in text,Like how we get character count like wise i need empty space count in a text, It would be more helpful if explained with an example."
let regex = try! NSRegularExpression(pattern: "\\s")
let numberOfWhitespaceCharacters = regex.numberOfMatches(in: string, range: NSRange(location: 0, length: string.utf16.count))
The regular expression \\s matches tab, CR, LF and space.
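As an alternative to the regular expression, Swift 5 exposes whitespace checks on Character and Unicode.Scalar. The sketch below assumes Swift 5 or later and shows that the two views can disagree: "\r\n" is a single Character, but two scalars (and two matches for \\s).

```swift
let sample = "a b\tc\r\nd"

// Counts whitespace Characters; "\r\n" is one Character here.
let whitespaceCharacters = sample.filter { $0.isWhitespace }.count

// Counts whitespace scalars; "\r" and "\n" are counted separately.
let whitespaceScalars = sample.unicodeScalars
    .filter { $0.properties.isWhitespace }
    .count

print(whitespaceCharacters, whitespaceScalars) // 3 4
```

Which count is appropriate depends on whether a CRLF line ending should be treated as one whitespace item or two.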
The easiest way is to do something like this:
let emptySpacesCount = yourString.characters.filter { $0 == " " }.count
This takes the characters of your string, filters out everything that is not a space, and then counts the remaining elements.
You can try this example:
let string = "Whitespace count in a string swift"
let spaceCount = string.characters.filter{$0 == " "}.count

In Swift, how can I detect an "emoji tag"? [duplicate]

This question already has answers here:
Find out if Character in String is emoji?
(17 answers)
Closed 6 years ago.
func getEmojiTags(text: String) -> [String] {
}
An emoji tag is a combination of two parts with no space between them: a character which is an emoji, followed by a string which is not an emoji. If there is a space between them, it is not an emoji tag.
For example:
Hello, my name is Jason 😀 🐸 how are you? -> []
Hello, my name is 😀Jason -> [😀Jason]
Hello, my name is 😀 Jason -> []
I am going to the ⛱beach with some 🙉monkeys πŸ‘ -> [⛱beach, 🙉monkeys]
Try using NSRegularExpression with emoji code ranges.
func emojiTags(str: String) -> [String] {
//A little bit simplified, you may need to define what are your "emoji".
//This is a subset defined in http://stackoverflow.com/a/36258684/6541007 .
let emojiCharacters = "\\U0001F600-\\U0001F64F\\U0001F300-\\U0001F5FF\\U0001F680-\\U0001F6FF\\u2600-\\u26FF"
let pattern = "[\(emojiCharacters)][^\(emojiCharacters)\\s]+"
let regex = try! NSRegularExpression(pattern: pattern, options: [])
let matches = regex.matchesInString(str, options: [], range: NSRange(0..<str.utf16.count))
return matches.map{(str as NSString).substringWithRange($0.range)}
}
let str1 = "Hello, my name is Jason 😀 🐸 how are you ?"
print(emojiTags(str1)) //->[]
let str2 = "Hello, my name is 😀Jason"
print(emojiTags(str2)) //->["😀Jason"]
let str3 = "Hello, my name is 😀 Jason"
print(emojiTags(str3)) //->[]
let str4 = "I am going to the ⛱beach with some 🙉monkeys πŸ‘"
print(emojiTags(str4)) //->["⛱beach", "🙉monkeys"]
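The answer above uses Swift 2 method names. A port to current Swift might look like the following sketch; it keeps the same simplified emoji ranges, and the function name emojiTags(in:) is chosen here only for illustration.

```swift
import Foundation

/// Sketch of the same approach in current Swift. The character ranges
/// are the simplified subset from the answer, not a full emoji list.
func emojiTags(in text: String) -> [String] {
    let emoji = "\\U0001F600-\\U0001F64F\\U0001F300-\\U0001F5FF\\U0001F680-\\U0001F6FF\\u2600-\\u26FF"
    // One emoji, then one or more characters that are neither emoji nor whitespace.
    let pattern = "[\(emoji)][^\(emoji)\\s]+"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }

    // NSRange over the whole string, measured in UTF-16 code units.
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { match in
        Range(match.range, in: text).map { String(text[$0]) }
    }
}

print(emojiTags(in: "Hello, my name is 😀Jason"))
print(emojiTags(in: "Hello, my name is 😀 Jason"))
```

Converting each match with Range(_:in:) avoids the cast to NSString and keeps the results as native Swift strings.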
