SwiftCsv Parser Support for DoubleQuotes - ios

I'm using the SwiftCSV library to parse a CSV file.
How can I make the parser ignore the comma delimiter inside double-quoted strings, e.g. "Abcd, went to Apple"?
Here the parser takes Abcd as one value and went to Apple as another value.
Code :
func parseRows(fromLines lines: [String]) -> [Dictionary<String, String>] {
var rows: [Dictionary<String, String>] = []
for (lineNumber, line) in enumerate(lines) {
if lineNumber == 0 {
continue
}
var row = Dictionary<String, String>()
let values = line.componentsSeparatedByCharactersInSet(self.delimiter)
for (index, header) in enumerate(self.headers) {
let value = values[index]
row[header] = value
}
rows.append(row)
}
return rows
}
How can I change line.componentsSeparatedByCharactersInSet(self.delimiter) to ignore commas within double quotes?

Use this instead https://github.com/Daniel1of1/CSwiftV
Had the same issue. It took me an hour or so to figure out the problem is that SwiftCSV doesn't, erm, work.
Text in CSV files is inside quotes so commas and newlines in a CSV don't screw up the parser. I looked at the SwiftCSV source and there is no support for that - meaning that any commas or newlines screw up the parsing.
You could patch up SwiftCSV, or just go with CSwiftV I linked above.

I don't know if you are still looking for a solution, but I just came up with a quick way to do this as it is a problem that just came up for me.
The code isn't foolproof because I am only using it for a side project, so if you want to handle more cases you will probably need to make some changes.
func parseRows(fromLines lines: [String]) -> [Dictionary<String, String>] {
var rows: [Dictionary<String, String>] = []
for (lineNumber, line) in enumerate(lines) {
if lineNumber == 0 {
continue
}
var row = Dictionary<String, String>()
// escape commas in the string when it is surrounded by quotes
let convertedLine = NSString(string: line) // have to convert string to NSString because string does not have all NSString API
var escapedLine = line
var searchRange = NSMakeRange(1,convertedLine.length)
var foundRange:NSRange
if NSString(string: line).containsString("\"")
{
while (searchRange.location < convertedLine.length) {
searchRange.length = convertedLine.length-searchRange.location
foundRange = convertedLine.rangeOfString("\"", options: nil, range: searchRange)
if (foundRange.location != NSNotFound) {
// found a quotation mark
searchRange.location = foundRange.location+foundRange.length
let movieTitle = convertedLine.substringToIndex(foundRange.location)
escapedLine = convertedLine.stringByReplacingOccurrencesOfString(",", withString: "&c", options: nil, range: NSMakeRange(0,foundRange.location))
} else {
// no more substring to find
break
}
}
}
var values = escapedLine.componentsSeparatedByCharactersInSet(self.delimiter)
for (index, header) in enumerate(self.headers) {
var value = values[index]
//reinsert commas if they were escaped and get rid of quotation marks
value = value.stringByReplacingOccurrencesOfString("\"", withString: "")
value = value.stringByReplacingOccurrencesOfString("&c", withString: ",")
row[header] = value
}
rows.append(row)
}
return rows
}
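For comparison, here is a minimal sketch in current Swift that avoids placeholder escaping by scanning the line character by character and tracking whether it is inside quotes. It is not SwiftCSV's or CSwiftV's code; the function name splitCSVLine is hypothetical, and it assumes the input has already been split into lines, so quoted newlines are not handled.

```swift
/// Splits one CSV line into fields. Delimiters inside double quotes are kept
/// as literal text, and a doubled quote ("") inside a quoted field is read as
/// one literal quote. (Hypothetical helper, not part of SwiftCSV.)
func splitCSVLine(_ line: String, delimiter: Character = ",") -> [String] {
    var fields: [String] = []
    var current = ""
    var insideQuotes = false
    let chars = Array(line)
    var i = 0

    while i < chars.count {
        let ch = chars[i]
        if ch == "\"" {
            if insideQuotes && i + 1 < chars.count && chars[i + 1] == "\"" {
                // Escaped quote inside a quoted field: emit one quote, skip the second
                current.append("\"")
                i += 1
            } else {
                // Opening or closing quote: flip the state, emit nothing
                insideQuotes.toggle()
            }
        } else if ch == delimiter && !insideQuotes {
            // Delimiter outside quotes ends the current field
            fields.append(current)
            current = ""
        } else {
            current.append(ch)
        }
        i += 1
    }
    fields.append(current)
    return fields
}

let fields = splitCSVLine("1,\"Abcd, went to Apple\",x")
// fields is ["1", "Abcd, went to Apple", "x"]
```

The returned array could stand in for the values array in parseRows, with no need to strip quotes or restore commas afterwards.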

Related

how to split a string with Umlaut like "ä" in swift

I'm writing a search routine in Swift 5 and want to highlight the part of a string that matches the search term (e.g. if I search for "bcd" in a string like "äbcdef", the result should look like "äbcdef" with "bcd" highlighted). To do so, I wrote an extension for String that splits a string into the substring before the match (= "before"), the match itself (= "match"), and the substring after it (= "after").
extension String {
func findSubstring(forSearchStr search: String, caseSensitive sensitive: Bool) -> (before: String, match: String, after: String) {
var before = self
var searchStr = search
if !sensitive {
before = before.lowercased()
searchStr = searchStr.lowercased()
}
var match = ""
var after = ""
let totalStringlength = before.count
let searchStringlength = searchStr.count
var startpos = self.startIndex
var endpos = self.endIndex
for id in 0 ... (totalStringlength - searchStringlength) {
startpos = self.index(self.startIndex, offsetBy: id)
endpos = self.index(startpos, offsetBy: searchStringlength)
if searchStr == String(before[startpos ..< endpos]) {
before = String(self[self.startIndex ..< startpos])
match = String(self[startpos ..< endpos])
if id < totalStringlength - searchStringlength - 1 {
startpos = self.index(startpos, offsetBy: searchStringlength)
after = String(self[startpos ..< self.endIndex])
}
break
}
}
return (before, match, after)
} // end findSubstring()
}
My problem is that this routine works well only for strings without special characters such as the German umlauts "ä", "ö", "ü" or "ß". If a string contains one of these characters, the returned substrings "match" and "after" are shifted one character to the right. In the example above, the search for "bcd" then highlights the wrong part of "äbcdef".
My question is: what do I have to do to handle these characters properly as well?
By the way, is there a simpler way to split a string as described than what I have programmed (which seems rather complex to me :) )?
Thanks for your valuable support
String comparison is a complicated issue, and is something you would not want to handle yourself unless you are studying this.
Just use String.range(of:options:):
extension String {
func findSubstring(forSearchStr search: String, caseSensitive sensitive: Bool) -> (before: String, match: String, after: String)? {
if let substringRange = range(of: search, options: sensitive ? [] : [.caseInsensitive], locale: nil) {
return (String(self[startIndex..<substringRange.lowerBound]),
String(self[substringRange]),
String(self[substringRange.upperBound..<self.endIndex]))
} else {
return nil
}
}
}
// (before: "ä", match: "bcd", after: "e")
print("äbcde".findSubstring(forSearchStr: "bcd", caseSensitive: true)!)
Note that this is not a literal comparison. For example:
// prints (before: "", match: "ß", after: "")
print("ß".findSubstring(forSearchStr: "ss", caseSensitive: false)!)
If you want a literal comparison, use the literal option:
range(of: search, options: sensitive ? [.literal] : [.caseInsensitive, .literal])

Get specific values from a string and create array

From some URL I create an array of strings, and I would like to grab some data from those strings and turn them into another array of variables.
My array of strings looks like this:
#EXTINF:-1 tvg-logo="https://www.thetvdb.com/banners/posters/248741-9.jpg" group-title="Broke Girls", trailer
#EXTINF:-1 tvg-logo="https://www.thetvdb.com/banners/posters/210841-10.jpg" group-title="Alphas", Alphas trailer
#EXTINF:-1 tvg-logo="https://www.thetvdb.com/banners/posters/309053-2.jpg" group-title="American Gothic", trailer
Every line represents a new string item from my array.
I am trying to create a function to do it, but until now, I only have this:
func grabValuesFromUrl(savedUrl: String) {
    var trailersArray: [(logo: String, group: String, title: String)] = []
    guard let url = URL(string: savedUrl) else {
        print("no url added")
        return
    }
    do {
        let contents = try String(contentsOf: url)
        contents.enumerateLines { (line, stop) in
            // Here I need to grab the values inside tvg-logo="" and group-title="", plus the title
            // after the last ",", and put them into trailersArray. Afterwards I will make a model
            // class to access the data as trailersArray.logo, trailersArray.group and trailersArray.title.
        }
    } catch {
        print("could not read contents: \(error)")
    }
}
Thanks in advance
I'd use a regex for anything related to extracting data from a string with a known format. For this, let's first define a helper function:
func matches(for regex: String, inText text: String) -> [String] {
guard let regex = try? NSRegularExpression(pattern: regex, options: [.caseInsensitive]) else { return [] }
let nsString = text as NSString
let results = regex.matches(in: text, options: [], range: NSMakeRange(0, nsString.length))
return results.flatMap { result in
(0..<result.numberOfRanges).map {
result.range(at: $0).location != NSNotFound ? nsString.substring(with: result.range(at: $0)) : ""
}
}
}
And then define the regular expression that will extract required data:
let regex = "^.*tvg-logo=\"(.+)\".*group-title=\"(.+)\".*, (.+)$"
Beware that this regex is sensitive to the data format, so you'll have to adapt it if the format changes.
Finally, in your line enumeration closure you can extract the data:
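let parts = Array(matches(for: regex, inText: line).dropFirst())
If the line matches the regex, parts is now an array with the three corresponding items (we drop the first one because it is the line itself, and wrap the result in Array so that indexing starts at 0 again instead of keeping the slice's original indices). We can then, for example, append a tuple with the values to the array: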
if parts.count == 3 {
trailersArray.append((logo: parts[0], group: parts[1], title: parts[2]))
}
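As a self-contained variation on the same idea, the sketch below parses one line into a small model type. The names Trailer and parseTrailer(from:) are hypothetical, and the pattern assumes the exact layout of the sample lines (tvg-logo first, then group-title, then a comma and the title). It uses [^"]* instead of .+ so that each capture stops at the closing quote.

```swift
import Foundation

/// Hypothetical model type for one playlist entry.
struct Trailer: Equatable {
    let logo: String
    let group: String
    let title: String
}

/// Extracts logo, group and title from one #EXTINF line.
/// Returns nil when the line does not have the expected layout.
func parseTrailer(from line: String) -> Trailer? {
    let pattern = "tvg-logo=\"([^\"]*)\".*group-title=\"([^\"]*)\"\\s*,\\s*(.*)$"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return nil }

    let fullRange = NSRange(line.startIndex..., in: line)
    guard let match = regex.firstMatch(in: line, options: [], range: fullRange),
          let logoRange = Range(match.range(at: 1), in: line),
          let groupRange = Range(match.range(at: 2), in: line),
          let titleRange = Range(match.range(at: 3), in: line)
    else { return nil }

    return Trailer(logo: String(line[logoRange]),
                   group: String(line[groupRange]),
                   title: String(line[titleRange]))
}

let sample = "#EXTINF:-1 tvg-logo=\"https://www.thetvdb.com/banners/posters/210841-10.jpg\" group-title=\"Alphas\", Alphas trailer"
if let trailer = parseTrailer(from: sample) {
    print(trailer.group, "-", trailer.title)
}
```

Inside the enumerateLines closure, each non-nil result could be appended to an array of Trailer values instead of tuples.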

Use of rangeOfCharacter in Swift 3.0

I am attempting to use rangeOfCharacter to create an app, but am unable to understand its documentation:
func rangeOfCharacter(from: CharacterSet, options:
String.CompareOptions, range: Range<String.Index>?)
-Finds and returns the range in the String of the first character from
a given character set found in a given range with given options.
Documentation link: https://developer.apple.com/documentation/swift/string#symbols
I am working on an exercise to create a function which will take in a name and return the name, minus any consonants before the first vowel. The name should be returned unchanged if there are no consonants before the first vowel.
Below is the code I have so far:
func shortNameFromName(name: String) -> String {
var shortName = name.lowercased()
let vowels = "aeiou"
let vowelRange = CharacterSet(charactersIn: vowels)
rangeOfCharacter(from: vowels, options: shortName,
range: substring(from: shortName[0]))
Any help is much appreciated. Apologies for the newbie mistakes.
I hate Swift ranges. But hopefully things will get better with Swift 4.
let name = "Michael"
var shortName = name.lowercased()
let vowels = "aeiou"
let vowelSet = CharacterSet(charactersIn: vowels)
let stringSet = shortName
if let range = stringSet.rangeOfCharacter(from: vowelSet, options: String.CompareOptions.caseInsensitive)
{
let startIndex = range.lowerBound
let substring = name.substring(from: range.lowerBound)
print(substring)
}
This problem can be solved with a regular expression.
Improved version:
func shortNameFromName(name: String) -> String {
do{
let regex2 = try NSRegularExpression(pattern: "[aeiou].*", options:[.dotMatchesLineSeparators]) // no "|" inside the character class, or a literal pipe would also match
if let result = regex2.firstMatch(in: name.lowercased(), options: .init(rawValue: 0), range: NSRange(location: 0, length: NSString(string: name).length))
{
return String(NSString(string: name).substring(with: result.range))
}
}
catch
{
debugPrint(error.localizedDescription)
}
return ""
}
Tested
debugPrint(self.shortNameFromName(name: "yhcasid")) //test1
debugPrint(self.shortNameFromName(name: "ayhcasid")) //test2
debugPrint(self.shortNameFromName(name: "😀abc")) //test3, working thanks to @MartinR
Console Log
test1 result
"asid"
test2 result
"ayhcasid"
test3 result
"abc"
Hope this helps
You are passing completely wrong arguments to the method.
rangeOfCharacter accepts three arguments. You passed in the character set correctly, but the last two arguments you passed make no sense. You should pass a set of options as the second argument; instead you passed in a string. The third argument is supposed to be a Range, but you passed the return value of a substring call.
I think rangeOfCharacter isn't suitable here. There are better ways to do this. For example:
func shortNameFromName(name: String) -> String {
return String(name.characters.drop(while: {!"aeiou".characters.contains($0)}))
}
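A sketch of the same idea in current Swift (4.2 or later), where String is itself a collection of characters and .characters is no longer needed. The function name shortName(from:) is hypothetical. Two assumptions go beyond the answer above: uppercase vowels count as vowels, and a name without any vowel is returned unchanged.

```swift
/// Returns the name without the consonants that precede its first vowel.
/// Assumption: a name containing no vowel at all is returned unchanged.
func shortName(from name: String) -> String {
    let vowels = "aeiouAEIOU"
    // Find the position of the first vowel, if there is one
    guard let firstVowel = name.firstIndex(where: { vowels.contains($0) }) else {
        return name
    }
    // Keep everything from the first vowel to the end, preserving case
    return String(name[firstVowel...])
}

print(shortName(from: "Michael")) // prints "ichael"
```

Because the index comes from name itself, no lowercased copy is needed and the original capitalization is preserved.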
Swift 3
Replace your code with this:
func shortNameFromName(name: String) -> String {
    let lowercasedName = name.lowercased()
    let vowels: [Character] = ["a","e","i","o","u"]
    var shortName = lowercasedName
    for character in lowercasedName.characters {
        if vowels.contains(character) {
            break
        }
        // Drop only the leading consonant, not every occurrence of it in the name
        shortName.remove(at: shortName.startIndex)
    }
    // Fall back to the full name when it contains no vowel at all
    return shortName.isEmpty ? lowercasedName : shortName
}

Remove repeating substring from string

I cannot think of a function to remove a repeating substring from my string. My string looks like this:
"<bold><bold>Rutger</bold> Roger</bold> rented a <bold>testitem zero dollars</bold> from <bold>Rutger</bold>."
And if <bold> is followed by another <bold> I want to remove the second <bold>. When removing that second <bold> I also want to remove the first </bold> that follows.
So the output that I'm looking for should be this:
"<bold>Rutger Roger</bold> rented a <bold>testitem zero dollars</bold> from <bold>Rutger</bold>."
Anyone know how to achieve this in Swift (2.2)?
I wrote a solution using a regex, assuming that a tag is nested inside itself at most once. In other words, it only cleans up doubled tags, nothing deeper. You can use the same code with a recursive call to clean up as many levels of repeated nested tags as you want:
class Cleaner {
var tags:Array<String> = [];
init(tags:Array<String>) {
self.tags = tags;
}
func cleanString(html:String) -> String {
var res = html
do {
for tag in tags {
let start = "<\(tag)>"
let end = "</\(tag)>"
let pattern = "\(start)(.*?)\(end)"
let regex = try NSRegularExpression(pattern: pattern, options: NSRegularExpression.Options.caseInsensitive)
let matches = regex.matches(in: res, options: [], range: NSRange(location: 0, length: res.utf16.count))
var diff = 0;
for match in matches {
let outer_range = NSMakeRange(match.rangeAt(0).location - diff, match.rangeAt(0).length)
let inner_range = NSMakeRange(match.rangeAt(1).location - diff, match.rangeAt(1).length)
let node = (res as NSString).substring(with: outer_range)
let content = (res as NSString).substring(with: inner_range)
// look for the starting tag in the content of the node
if content.range(of: start) != nil {
res = (res as NSString).replacingCharacters(in: outer_range, with: content);
//for shifting future ranges
diff += (node.utf16.count - content.utf16.count)
}
}
}
}
catch {
print("regex was bad!")
}
return res
}
}
let cleaner = Cleaner(tags: ["bold"]);
let html = "<bold><bold>Rutger</bold> Roger</bold> rented a <bold><bold>testitem</bold> zero dollars</bold> from <bold>Rutger</bold>."
let cleaned = cleaner.cleanString(html: html)
print(cleaned)
//<bold>Rutger Roger</bold> rented a <bold>testitem zero dollars</bold> from <bold>Rutger</bold>.
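A different technique that handles any nesting depth in one pass is to count how deep the scanner currently is inside the tag and keep only the outermost pair. The following is a minimal sketch of that idea in current Swift, not a variant of the regex code above. The function name flattenNestedTag is hypothetical, and it assumes tags are written exactly as <tag> and </tag>, without attributes. A stray closing tag with no matching opener is dropped.

```swift
/// Removes copies of `tag` that are nested inside the same tag,
/// keeping only the outermost opening and closing pair.
func flattenNestedTag(_ tag: String, in text: String) -> String {
    let openTag = "<\(tag)>"
    let closeTag = "</\(tag)>"
    var result = ""
    var depth = 0                 // how many unclosed `tag`s we are inside
    var rest = Substring(text)

    while !rest.isEmpty {
        if rest.hasPrefix(openTag) {
            if depth == 0 { result += openTag }   // keep only the outermost opener
            depth += 1
            rest = rest.dropFirst(openTag.count)
        } else if rest.hasPrefix(closeTag) {
            if depth == 1 { result += closeTag }  // keep only the outermost closer
            depth = max(depth - 1, 0)
            rest = rest.dropFirst(closeTag.count)
        } else {
            // Ordinary character: copy it through unchanged
            result.append(rest.removeFirst())
        }
    }
    return result
}

let input = "<bold><bold>Rutger</bold> Roger</bold> rented a <bold>testitem zero dollars</bold> from <bold>Rutger</bold>."
print(flattenNestedTag("bold", in: input))
// prints <bold>Rutger Roger</bold> rented a <bold>testitem zero dollars</bold> from <bold>Rutger</bold>.
```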
Try this; I just wrote it. Hope it helps.
class Test : NSObject {
static func removeFirstString (originString: String, removeString: String, withString: String) -> String {
var genString = originString
if originString.contains(removeString) {
let range = originString.range(of: removeString)
genString = genString.replacingOccurrences(of: removeString, with: withString, options: String.CompareOptions.anchored, range: range)
}
return genString
}
}
var newString = Test.removeFirstString(originString: str, removeString: "<bold>", withString: "")
newString = Test.removeFirstString(originString: newString, removeString: "</bold>", withString: "")

How can I remove or replace all punctuation characters from a String?

I have a string composed of words, some of which contain punctuation, which I would like to remove, but I have been unable to figure out how to do this.
For example if I have something like
var words = "Hello, this : is .. a string?"
I would like to be able to create an array with
"[Hello, this, is, a, string]"
My original thought was to use something like words.stringByTrimmingCharactersInSet() to remove any characters I didn't want but that would only take characters off the ends.
I thought maybe I could iterate through the string with something in the vein of
for letter in words {
if NSCharacterSet.punctuationCharacterSet.characterIsMember(letter){
//remove that character from the string
}
}
but I'm unsure how to remove the character from the string. I'm sure there are some problems with the way that if statement is set up, as well, but it shows my thought process.
Xcode 11.4 • Swift 5.2 or later
extension StringProtocol {
var words: [SubSequence] {
split(whereSeparator: \.isLetter.negation)
}
}
extension Bool {
var negation: Bool { !self }
}
let sentence = "Hello, this : is .. a string?"
let words = sentence.words // ["Hello", "this", "is", "a", "string"]
String has an enumerateSubstringsInRange() method.
With the .ByWords option, it detects word boundaries and
punctuation automatically:
Swift 3/4:
let string = "Hello, this : is .. a \"string\"!"
var words : [String] = []
string.enumerateSubstrings(in: string.startIndex..<string.endIndex,
options: .byWords) {
(substring, _, _, _) -> () in
words.append(substring!)
}
print(words) // ["Hello", "this", "is", "a", "string"]
Swift 2:
let string = "Hello, this : is .. a \"string\"!"
var words : [String] = []
string.enumerateSubstringsInRange(string.characters.indices,
options: .ByWords) {
(substring, _, _, _) -> () in
words.append(substring!)
}
print(words) // ["Hello", "this", "is", "a", "string"]
This works with Xcode 8.1, Swift 3:
First define general-purpose extension for filtering by CharacterSet:
extension String {
func removingCharacters(inCharacterSet forbiddenCharacters:CharacterSet) -> String
{
var filteredString = self
while true {
if let forbiddenCharRange = filteredString.rangeOfCharacter(from: forbiddenCharacters) {
filteredString.removeSubrange(forbiddenCharRange)
}
else {
break
}
}
return filteredString
}
}
Then filter using punctuation:
let s:String = "Hello, world!"
s.removingCharacters(inCharacterSet: CharacterSet.punctuationCharacters) // => "Hello world"
let charactersToRemove = NSCharacterSet.punctuationCharacterSet() // not inverted: splitting on the inverted set would keep only the punctuation
let aWord = "".join(words.componentsSeparatedByCharactersInSet(charactersToRemove))
An alternate way to filter characters from a set and obtain an array of words is by using the array's filter and reduce methods. It's not as compact as other answers, but it shows how the same result can be obtained in a different way.
First define an array of the characters to remove:
let charactersToRemove = Set(Array(".:?,"))
next convert the input string into an array of characters:
let arrayOfChars = Array(words)
Now we can use reduce to build a string, obtained by appending the elements from arrayOfChars, but skipping all the ones included in charactersToRemove:
let filteredString = arrayOfChars.reduce("") {
let str = String($1)
return $0 + (charactersToRemove.contains($1) ? "" : str)
}
This produces a string without the punctuation characters (as defined in charactersToRemove).
The last 2 steps:
split the string into an array of words, using the blank character as separator:
let arrayOfWords = filteredString.componentsSeparatedByString(" ")
last, remove all empty elements:
let finalArrayOfWords = arrayOfWords.filter { $0.isEmpty == false }
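The same filter-then-split steps can be sketched in current Swift using only the standard library. The function name words(in:removing:) is hypothetical; as in the answer above, the set of characters to remove is supplied by the caller rather than taken from a predefined punctuation set.

```swift
/// Removes the given characters from the text, then splits it into words on spaces.
func words(in text: String, removing unwanted: Set<Character>) -> [String] {
    // Step 1: drop every unwanted character
    let filtered = text.filter { !unwanted.contains($0) }
    // Step 2: split on spaces; split(separator:) omits empty pieces by default,
    // so no separate "remove empty elements" pass is needed
    return filtered.split(separator: " ").map(String.init)
}

let result = words(in: "Hello, this : is .. a string?", removing: [".", ":", "?", ","])
print(result) // prints ["Hello", "this", "is", "a", "string"]
```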
NSScanner way:
let words = "Hello, this : is .. a string?"
//
let scanner = NSScanner(string: words)
var wordArray:[String] = []
var word:NSString? = ""
while(!scanner.atEnd) {
let sr = scanner.scanCharactersFromSet(NSCharacterSet(charactersInString: "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"), intoString: &word)
if !sr {
scanner.scanLocation++
continue
}
wordArray.append(String(word!))
}
println(wordArray)
