Parse CSV file in Swift - iOS

I am parsing data from a CSV file into a dictionary using a library from GitHub.
After parsing, I get this kind of dictionary:
{
"" = "";
"\"barred_date\"" = "\"\"";
"\"company_id\"" = "\"1\"";
"\"company_name\"" = "\"\"";
"\"contact_no\"" = "\"1234567890\"";
"\"created_date\"" = "\"2015-06-01 12:43:11\"";
"\"current_project\"" = "\"111\"";
"\"designation\"" = "\"Developer\"";
"\"doj\"" = "\"2015-06-01 00:00:00\"";
"\"fin_no\"" = "\"ABC001\"";
"\"first_name\"" = "\"sssd\"";
"\"last_name\"" = "\"dd\"";
"\"project_name\"" = "\"Project 1\"";
"\"qr_code\"" = "\"12345678\"";
"\"resignation_date\"" = "\"\"";
"\"status\"" = "\"1\"";
"\"work_permit_no\"" = "\"ssdda11\"";
"\"worker_id\"" = "\"1\"";
"\"worker_image\"" = "\"assets/uploads/workers/eb49364ca5c5d22f11db2e3c84ebfce6.jpeg\"";
"\"worker_image_thumb\"" = "\"assets/uploads/workers/thumbs/eb49364ca5c5d22f11db2e3c84ebfce6.jpeg\"";}
How can I convert this to a simple dictionary? I need the data like this: "company_id" = "1"
Thanks

I recommend using CSVImporter – it takes care of things like quoted text (following RFC 4180) for you and even handles very large files without problems.
Compared to other solutions, it works asynchronously (which prevents delays) and reads your CSV file line by line instead of loading the entire String into memory (which prevents memory issues). On top of that, it is easy to use and provides callbacks for failure, progress, completion and even data mapping if you want it.
You can use it like this to get an array of Strings per line:
let path = "path/to/your/CSV/file"
let importer = CSVImporter<[String]>(path: path)
importer.startImportingRecords { $0 }.onFinish { importedRecords in
for record in importedRecords {
// record is of type [String] and contains all data in a line
}
}
You can also use more sophisticated features, such as header structure support:
// given this CSV file content
firstName,lastName
Harry,Potter
Hermione,Granger
Ron,Weasley
// you can import data in Dictionary format
let path = "path/to/Hogwarts/students"
let importer = CSVImporter<[String: String]>(path: path)
importer.startImportingRecords(structure: { (headerValues) -> Void in
// use the header values CSVImporter has found if needed
print(headerValues) // => ["firstName", "lastName"]
}) { $0 }.onFinish { importedRecords in
for record in importedRecords {
// a record is now a Dictionary with the header values as keys
print(record) // => e.g. ["firstName": "Harry", "lastName": "Potter"]
print(record["firstName"]) // prints "Harry" on first, "Hermione" on second run
print(record["lastName"]) // prints "Potter" on first, "Granger" on second run
}
}

Use the CSwiftV parser instead: https://github.com/Daniel1of1/CSwiftV
It actually handles quoted text, and therefore it handles both line breaks and commas in text. SwiftCSV cost me some time in that it doesn't handle that. But I did learn about the CSV format and parsing it ;)
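If switching parsers is not an option, the quoted keys and values shown in the question can also be cleaned up after parsing. The following is a minimal sketch of that idea, assuming each quoted field is wrapped in exactly one pair of double quotes and contains no escaped quotes. The helpers unquoted and unquotedDictionary are hypothetical names for this illustration, not part of any library mentioned here.

```swift
import Foundation

// Hypothetical helper: strips one pair of surrounding double quotes, if present.
func unquoted(_ text: String) -> String {
    guard text.count >= 2, text.hasPrefix("\""), text.hasSuffix("\"") else {
        return text
    }
    return String(text.dropFirst().dropLast())
}

// Hypothetical helper: rebuilds a dictionary with unquoted keys and values.
func unquotedDictionary(_ raw: [String: String]) -> [String: String] {
    var clean: [String: String] = [:]
    for (key, value) in raw {
        clean[unquoted(key)] = unquoted(value)
    }
    return clean
}

// Sample entries copied from the dictionary in the question.
let raw = ["\"company_id\"": "\"1\"", "\"designation\"": "\"Developer\"", "": ""]
let clean = unquotedDictionary(raw)
print(clean["company_id"] ?? "missing") // prints "1"
```

A parser that follows RFC 4180 remains the more robust choice, since this cleanup does nothing for commas or line breaks inside quoted fields.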

Parse the CSV into a two-dimensional array of Strings (rows and columns):
func parseCsv(_ data: String) -> [[String]] {
// data: String = contents of a CSV file.
// Returns: [[String]] = two-dimension array [rows][columns].
// Data minimum two characters or fail.
if data.count < 2 {
return []
}
var a: [String] = [] // Array of columns.
var index: String.Index = data.startIndex
let maxIndex: String.Index = data.index(before: data.endIndex)
var q: Bool = false // "Are we in quotes?"
var result: [[String]] = []
var v: String = "" // Column value.
while index < data.endIndex {
if q { // In quotes.
if (data[index] == "\"") {
// Found quote; look ahead for another.
if index < maxIndex && data[data.index(after: index)] == "\"" {
// Found another quote means escaped.
// Increment and add to column value.
data.formIndex(after: &index)
v += String(data[index])
} else {
// Next character not a quote; last quote not escaped.
q = !q // Toggle "Are we in quotes?"
}
} else {
// Add character to column value.
v += String(data[index])
}
} else { // Not in quotes.
if data[index] == "\"" {
// Found quote.
q = !q // Toggle "Are we in quotes?"
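} else if data[index] == "\n" || data[index] == "\r" || data[index] == "\r\n" {
// Swift treats "\r\n" as a single Character, so all three line endings are checked.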
// Reached end of line.
// Column and row complete.
a.append(v)
v = ""
result.append(a)
a = []
} else if data[index] == "," {
// Found comma; column complete.
a.append(v)
v = ""
} else {
// Add character to column value.
v += String(data[index])
}
}
if index == maxIndex {
// Reached end of data; flush.
// A trailing comma as the last character means one more (empty) column.
if v.count > 0 || data[index] == "," {
a.append(v)
}
if a.count > 0 {
result.append(a)
}
break
}
data.formIndex(after: &index) // Increment.
}
return result
}
Call above with the CSV data
var dataArray: [[String]] = parseCsv(yourStringOfCsvData) // var, because the header row is removed next
Then extract the header row
let dataHeader = dataArray.removeFirst()
I assume you want an array of dictionaries (most spreadsheet data includes multiple rows, not just one), so the next loop builds one dictionary per row. If you only need a single row (with the header for keys) in a single dictionary, you can study the loop below and adapt it.
var da: [Dictionary<String, String>] = [] // Array of dictionaries.
for row in dataArray {
    var d: Dictionary<String, String> = Dictionary() // One dictionary per row.
    // Skip any extra columns that have no matching header.
    for (index, column) in row.enumerated() where index < dataHeader.count {
        d.updateValue(column, forKey: dataHeader[index])
    }
    da.append(d)
}

Related

Get elements separate from array

What is the best or most elegant way to get all elements without spaces, ignoring "DOMICILIO" and the elements after it? For example:
"LOAIZA"
"HERRERA"
"JESUS" (This is my expected output)
In this case I have one string with 2 elements ("LOAIZA\nHERRERA")
["LOAIZA HERRERA", "JESUS", "DOMICILIO", "CALLE1", "CALLE2"]
var dataID = ["LOAIZA HERRERA", "JESUS", "DOMICILIO"]
for i in dataID {
if i.contains(" "){
print(i) // LOAIZA HERRERA
let dataSeparate = i.components(separatedBy: " ")
print(dataSeparate) // ["LOAIZA", "HERRERA"]
}
}
Split each term on " " and flatMap the pieces into a new array. Then find the index of "DOMICILIO", if it exists, and keep only the elements before it.
func findResult(dataID: [String]) -> Array<String> {
    let terms = dataID.flatMap { $0.components(separatedBy: " ") }
    // Keep everything before "DOMICILIO"; if it is absent, keep all terms.
    if let indexOfDom = terms.firstIndex(of: "DOMICILIO") {
        return Array(terms[..<indexOfDom])
    } else {
        return terms
    }
}
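A similar result can be sketched with prefix(while:), which stops at the first "DOMICILIO" without computing an index. This version also splits on newlines, on the assumption that entries such as "LOAIZA\nHERRERA" can occur. The function name namesBeforeMarker and its marker parameter are made up for this illustration.

```swift
import Foundation

// Hypothetical helper: splits every entry on whitespace and newlines,
// then keeps the terms that come before the marker.
func namesBeforeMarker(_ dataID: [String], marker: String = "DOMICILIO") -> [String] {
    let terms = dataID
        .flatMap { $0.components(separatedBy: .whitespacesAndNewlines) }
        .filter { !$0.isEmpty } // drop empty pieces left by repeated separators
    return Array(terms.prefix(while: { $0 != marker }))
}

let dataID = ["LOAIZA HERRERA", "JESUS", "DOMICILIO", "CALLE1", "CALLE2"]
print(namesBeforeMarker(dataID)) // ["LOAIZA", "HERRERA", "JESUS"]
```

If the marker is missing, the whole flattened list is returned; if it comes first, the result is empty.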

Comparing quantities of unique elements contained in large data sets [duplicate]

I'm attempting to solve HackerRank's Hash Table Ransom Note challenge. There are 19 test cases and I'm passing all but two of them, which fail due to timeouts on larger data sets (10,000-30,000 entries).
I'm given:
1) an array of words contained in a magazine and
2) an array of words for a ransom note. My objective is to determine if the words in the magazine can be used to construct a ransom note.
I need to have enough unique elements in the magazineWords to satisfy the quantity needed by noteWords.
I'm using this code to make that determination...and it takes FOREVER...
for word in noteWordsSet {
// check if there are enough unique words in magazineWords to put in the note
if magazineWords.filter({$0==word}).count < noteWords.filter({$0==word}).count {
return "No"
}
}
What is a faster way to accomplish this task?
Below is my complete code for the challenge:
import Foundation
var magazineWords = // Array of 1 to 30,000 strings
var noteWords = // Array of 1 to 30,000 strings
enum RegexString: String {
// Letters a to z, A to Z, 1 to 5 characters long
case wordCanBeUsed = "([a-zA-Z]{1,5})"
}
func matches(for regexString: String, in text: String) -> [String] {
// Hat tip MartinR for this
do {
let regex = try NSRegularExpression(pattern: regexString)
let nsString = text as NSString
let results = regex.matches(in: text, range: NSRange(location: 0, length: nsString.length))
return results.map { nsString.substring(with: $0.range)}
} catch let error {
print("invalid regex: \(error.localizedDescription)")
return []
}
}
func canCreateRansomNote(from magazineWords: [String], for noteWords: [String]) -> String {
// figure out what's unique
let magazineWordsSet = Set(magazineWords)
let noteWordsSet = Set(noteWords)
let intersectingValuesSet = magazineWordsSet.intersection(noteWordsSet)
// constraints specified in challenge
guard magazineWords.count >= 1, noteWords.count >= 1 else { return "No" }
guard magazineWords.count <= 30000, noteWords.count <= 30000 else { return "No" }
// make sure there are enough individual words to work with
guard magazineWordsSet.count >= noteWordsSet.count else { return "No" }
guard intersectingValuesSet.count == noteWordsSet.count else { return "No" }
// check if all the words can be used. assume the regex method works perfectly
guard noteWords.count == matches(for: RegexString.wordCanBeUsed.rawValue, in: noteWords.joined(separator: " ")).count else { return "No" }
// FIXME: this is a processor hog. I'm timing out when I get to this point
// need to make sure there are enough magazine words to write the note
// compare quantity of word in magazine with quantity of word in note
for word in noteWordsSet {
// check if there are enough unique words in magazineWords to put in the note
if magazineWords.filter({$0==word}).count < noteWords.filter({$0==word}).count {
return "No"
}
}
return "Yes"
}
print(canCreateRansomNote(from: magazineWords, for: noteWords))
I don't know how to read from the test case on the contest website or which frameworks you are allowed to use. If Foundation is allowed, you can use NSCountedSet:
import Foundation
let fileContent = try! String(contentsOf: URL(fileURLWithPath: "/path/to/file.txt"))
let scanner = Scanner(string: fileContent)
var m = 0
var n = 0
scanner.scanInt(&m)
scanner.scanInt(&n)
var magazineWords = NSCountedSet(capacity: m)
var ransomWords = NSCountedSet(capacity: n)
for i in 0..<(m+n) {
var word: NSString? = nil
scanner.scanUpToCharacters(from: .whitespacesAndNewlines, into: &word)
if i < m {
magazineWords.add(word!)
} else {
ransomWords.add(word!)
}
}
var canCreate = true
for w in ransomWords {
if ransomWords.count(for: w) > magazineWords.count(for: w) {
canCreate = false
break
}
}
print(canCreate ? "Yes" : "No")
It works by going through the input file one word at a time, counting how many times each word appears in the magazine and in the ransom note. Then, if any word appears more frequently in the ransom note than in the magazine, it fails the test immediately. It runs the 30,000-word test case in less than 1 second on my 2012 iMac.
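The same counting idea can also be sketched without NSCountedSet, using a plain Swift dictionary as the tally. This is only an illustration of the approach, not the answerer's code: the function name canBuildNote is hypothetical, and it returns a Bool rather than the "Yes"/"No" strings the challenge expects.

```swift
// Tallies the magazine words once, then spends one count per note word.
// Both passes are linear, so the work is O(m + n) instead of O(m * n).
func canBuildNote(from magazineWords: [String], for noteWords: [String]) -> Bool {
    var available: [String: Int] = [:]
    for word in magazineWords {
        available[word, default: 0] += 1
    }
    for word in noteWords {
        // Fail as soon as a word is missing or used up.
        guard let count = available[word], count > 0 else { return false }
        available[word] = count - 1
    }
    return true
}

let magazine = ["give", "me", "one", "grand", "today", "night"]
print(canBuildNote(from: magazine, for: ["give", "one", "grand", "today"]) ? "Yes" : "No") // prints "Yes"
print(canBuildNote(from: ["two", "times", "three"], for: ["two", "two"]) ? "Yes" : "No") // prints "No"
```

The repeated filter calls in the question rescan both arrays for every unique word, which is what makes the large test cases time out; a single tally avoids that.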

Swift 3 For loop compare and change

Here is my code so far
var counter = 0
for i in 0...9 {
var val = NamePicker()
// array to find duplicates
var buttonValues = ["", "", "", "", "", "", "", "", "", ""] // array for button names
buttonValues.insert(val, at: counter)
print(buttonValues[counter])
counter += 1
}
This code puts 10 string values into my array. What I would like to do is find a way to check each value in my array. For example, if my end result array is ["a","a","a","b","b","c","c","e","f","c"], I want to see if there is a triple of the same name (singles and duplicates are fine). If there is a triple, I would like to change the 3rd value to another val from my NamePicker() function.
So with my array of
["a","a","a","b","b","c","c","e","f","c"]
there are 3 "a" and 3 "c". Having two of the same is OK, but I would like to change the 3rd to a new value, and if the new value makes another triple, it should change again until there are no more triples.
So that array could possibly have an end result of
["a","a","f","b","b","c","c","e","f","z"]
This is where the triples were changed.
Any help on how to do this efficiently?
Both options below assume that your NamePicker() function can generate at least 5 distinct values, so that an array satisfying your requirement exists.
Your requirement is better handled by not generating so many duplicates to begin with. If all you want is an array of names where each name cannot be repeated more than twice, try this:
var buttonValues = [String]()
var dict = [String: Int]()
while buttonValues.count < 10 {
let name = NamePicker()
let count = dict[name] ?? 0
guard count < 2 else { continue }
buttonValues.append(name)
dict[name] = count + 1
}
If you already have the array and want to correct it, do this:
var buttonValues = ["a","a","a","b","b","c","c","e","f","c"]
// Scan the array to tally how many times each name appears
var totalDict = [String: Int]()
buttonValues.forEach { totalDict[$0] = (totalDict[$0] ?? 0) + 1 }
// Now scan it again to update names that appear too many times
var runningDict = [String: Int]()
for (index, value) in buttonValues.enumerated() {
let count = runningDict[value] ?? 0
if count >= 2 {
while true {
let newValue = NamePicker()
let newTotal = (totalDict[newValue] ?? 0) + 1
if newTotal < 3 {
buttonValues[index] = newValue
totalDict[newValue] = newTotal
break
}
}
} else {
runningDict[value] = count + 1
}
}
Dictionary is the best way I think. Have the key be the character and the value be the count of that character. Your runtime will be O(n) since you only have to run through each input once. Here is an example:
let chars = ["a","a","a","b","b","c","c","e","f","c"]
var dict = [String: Int]()
for char in chars {
//If already in Dictionary, increase by one
if var count = dict[char] {
count += 1
dict[char] = count
} else {//else is not in the dictionary already, init with 1
dict[char] = 1
}
}
Output:
["b": 2, "e": 1, "a": 3, "f": 1, "c": 3]
Now I'm not sure how you want to replace the value that's the same character for a third time, but this is probably the best way to group the strings to determine which are over the limit.
Instead of inserting invalid values and then checking them afterwards, I would suggest building the correct array from the start.
//array for button names
var buttonValues = Array<String>()
//tracks what value has been inserted how many times
var trackerDict = [String: Int]()
for _ in 0...9 {
    //tells us if we found a valid value (one that has not been inserted 2 times already)
    var foundValidValue = false
    while !foundValidValue {
        let val = NamePicker()
        //a value that is not in the dictionary yet counts as inserted 0 times
        let count = trackerDict[val] ?? 0
        //if the value has been inserted less than 2 times, we can add it
        if count < 2 {
            foundValidValue = true
            trackerDict[val] = count + 1
            buttonValues.append(val)
        }
        //otherwise we just run through the loop again
    }
}
I added a dictionary because it is faster to keep track of the count in a dictionary than counting the number of occurrences in the array every time.

How to get all strings between particular delimiters?

I have a string called source. This string contains tags, marked with number signs (#) on left and right side.
What is the most efficient way to get tag names from the source string.
Source string:
let source = "Here is tag 1: ##TAG_1##, tag 2: ##TAG_2##."
Expected result:
["TAG_1", "TAG_2"]
Not a very short solution, but here you go:
let tags = source.components(separatedBy: CharacterSet(charactersIn: " ,."))
    .filter { str in str.hasPrefix("##") && str.hasSuffix("##") }
    .map { str in str.replacingOccurrences(of: "##", with: "") }
Split the string at all occurrences of ##:
let components = source.components(separatedBy: "##")
// Result: ["Here is tag 1: ", "TAG_1", ", tag 2: ", "TAG_2", "."]
Check that there's an odd number of components, otherwise there's an odd amount of ##s:
guard components.count % 2 == 1 else { fatalError("Unbalanced delimiters") }
Get every second element:
components.enumerated().filter{ $0.offset % 2 == 1 }.map{ $0.element }
In a single function:
import Foundation
func getTags(source: String, delimiter: String = "##") -> [String] {
let components = source.components(separatedBy: delimiter)
guard components.count % 2 == 1 else { fatalError("Unbalanced delimiters") }
return components.enumerated().filter{ $0.offset % 2 == 1 }.map{ $0.element }
}
getTags(source: "Here is tag 1: ##TAG_1##, tag 2: ##TAG_2##.") // ["TAG_1", "TAG_2"]
You can read this post and adapt the answer for your needs: Swift: Split a String into an array
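If not, you can also write your own method. A Swift string is a collection of characters, so you can loop over it and check for '#':
// Manual scan: collect the characters between each pair of "##" markers.
let chars = Array(source)
var tags: [String] = []
var i = 0
while i + 1 < chars.count {
    if chars[i] == "#" && chars[i + 1] == "#" {
        // Opening "##" found; read up to the closing "##".
        var j = i + 2
        var tag = ""
        while j + 1 < chars.count && !(chars[j] == "#" && chars[j + 1] == "#") {
            tag.append(chars[j]) // concatenate the characters into another variable
            j += 1
        }
        if j + 1 < chars.count {
            tags.append(tag) // closing "##" found; do what you need to with the tag
        }
        i = j + 2 // skip past the closing "##"
    } else {
        i += 1
    }
}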
I am more of a C++ programmer than a Swift programmer, so this is how I would approach it if I didn't want to use standard methods. There may be a better way of doing it, but I don't have any Swift knowledge.
Keep in mind if this does not compile then you may have to adapt the code slightly as I do not have a development environment I can test this in before posting.

Replace part of string with lower case letters - Swift

I have a Swift-based iOS app, and one of its features allows you to comment on a post. Users can add "#mentions" in their posts to tag other people. However, I want to stop users from adding a username with a capital letter.
Is there any way I can convert a string so that the #usernames are all in lowercase?
For example:
I really enjoy sightseeing with #uSerABC (not allowed)
I really enjoy sightseeing with #userabc (allowed)
I know there is a string property in Swift called .lowercaseString, but the problem is that it makes the entire string lowercase, and that's not what I want. I only want the #username to be in lowercase.
Is there any way around this without having to use the .lowercaseString property?
Thanks for your time, Dan.
This comes from a code I use to detect hashtags, I've modified to detect mentions:
func detectMentionsInText(text: String) -> [NSRange]? {
let mentionsDetector = try? NSRegularExpression(pattern: "#(\\w+)", options: NSRegularExpressionOptions.CaseInsensitive)
let results = mentionsDetector?.matchesInString(text, options: NSMatchingOptions.WithoutAnchoringBounds, range: NSMakeRange(0, text.utf16.count)).map { $0 }
return results?.map{$0.rangeAtIndex(0)}
}
It detects all the mentions in a string by using a regex and returns an NSRange array, by using a range you have the beginning and the end of the "mention" and you can easily replace them with a lower case version.
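As a rough sketch of that replacement step, the function below lowercases every matched range in place. It is written for current Swift, whereas the snippets in this thread use Swift 2 API names. The function name lowercasingMentions and its prefix parameter are assumptions made for this example, with the default "#" taken from the examples in the question.

```swift
import Foundation

// Hypothetical helper: lowercases every mention (prefix followed by word characters).
func lowercasingMentions(in text: String, prefix: String = "#") -> String {
    let pattern = NSRegularExpression.escapedPattern(for: prefix) + "\\w+"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return text }

    let result = NSMutableString(string: text)
    let fullRange = NSRange(location: 0, length: result.length)

    // Replace from the last match to the first so that earlier ranges stay valid
    // even if lowercasing changes the length of a mention.
    for match in regex.matches(in: text, range: fullRange).reversed() {
        let mention = result.substring(with: match.range)
        result.replaceCharacters(in: match.range, with: mention.lowercased())
    }
    return result as String
}

print(lowercasingMentions(in: "I really enjoy sightseeing with #uSerABC"))
// prints "I really enjoy sightseeing with #userabc"
```

Text outside the matched ranges is left untouched, so capital letters elsewhere in the comment are preserved.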
Split the string into two using the following command -
let arr = myString.componentsSeparatedByString("#")
//Convert arr[1] to lower case
//Append to arr[0]
//Enjoy
Thanks to everyone for their help. In the end I couldn't get any of the solutions to work and after a lot of testing, I came up with this solution:
func correctStringWithUsernames(inputString: String, completion: (correctString: String) -> Void) {
// Create the final string and get all
// the seperate strings from the data.
var finalString: String!
var commentSegments: NSArray!
commentSegments = inputString.componentsSeparatedByString(" ")
if (commentSegments.count > 0) {
for (var loop = 0; loop < commentSegments.count; loop++) {
// Check the username to ensure that there
// are no capital letters in the string.
let currentString = commentSegments[loop] as! String
let capitalLetterRegEx = ".*[A-Z]+.*"
let textData = NSPredicate(format:"SELF MATCHES %@", capitalLetterRegEx)
let capitalResult = textData.evaluateWithObject(currentString)
// Check if the current loop string
// is a #user mention string or not.
if (currentString.containsString("#")) {
// If we are in the first loop then set the
// string otherwise concatenate the string.
if (loop == 0) {
if (capitalResult == true) {
// The username contains capital letters
// so change it to a lower case version.
finalString = currentString.lowercaseString
}
else {
// The username does not contain capital letters.
finalString = currentString
}
}
else {
if (capitalResult == true) {
// The username contains capital letters
// so change it to a lower case version.
finalString = "\(finalString) \(currentString.lowercaseString)"
}
else {
// The username does not contain capital letters.
finalString = "\(finalString) \(currentString)"
}
}
}
else {
// The current string is NOT a #user mention
// so simply set or concatenate the finalString.
if (loop == 0) {
finalString = currentString
}
else {
finalString = "\(finalString) \(currentString)"
}
}
}
}
else {
// No issues pass back the string.
finalString = inputString
}
// Pass back the correct username string.
completion(correctString: finalString)
}
It's certainly not the most elegant or efficient solution around, but it does work. If there are any ways of improving it, please leave a comment.

Resources