What is the most efficient way to get the count of the unique values on a table?
For example:
fruitType
---------
banana
banana
apple
apple
apple
bananas : 2
apples : 3
By using fruitsCollection.distinct(by: ["fruitType"]) i can get the distinct values but not the count.
Any help would be appreciated.
you could try something simple like this (assuming fruits are Strings, adjust accordingly to your object types):
let fruits = fruitsCollection.distinct(by: ["fruitType"])
var results = [String:Int]()
Set(fruits).forEach{ fruit in results[fruit] = (fruits.filter{$0 == fruit}).count }
print("---> results: \(results)")
or
let results: [String:Int] = Set(fruits).reduce(into: [:]) { dict, next in
dict[next] = (fruits.filter{$0 == next}).count }
print("---> results: \(results)")
The Set(fruits) gives you the unique set of fruit names. The filter{...} gives you the count of each. The forEach or the reduce turns the results into a dictionary of key values.
#workingdog answer works very well but here's a more Realmy option. Something to keep in mind is that Realm Results objects are lazily loaded - meaning working with very large datasets has a low memory impact.
However, as soon high level Swift functions are used, that lazy-ness is lost and every object gobbles up memory.
For example, loading 50,000 Realm objects into a Results object doesn't have any significant memory impact - however, loading 50,000 objects into an Array could overwhelm the device as the objects loose their lazy-loading nature.
With this solution, we rely on Realm to present unique values and store them in Results (lazy!) then iterating over those we filter for matching objects (lazy) and return their count.
I created a FruitClass to hold the fruit types
class FruitClass: Object {
#Persisted var fruitType = ""
}
and then to code
This is a very memory friendly solution
//get the unique types of fruit. results are lazy!
let results = realm.objects(FruitClass.self).distinct(by: ["fruitType"])
//iterate over the results to get each fruit type, and then filter for those to get the count of each
for fruit in results {
let type = fruit.fruitType
let count = realm.objects(FruitClass.self).filter("fruitType == %#", type).count
print("\(type) has a count of \(count)")
}
and the results
apple has a count of 3
banana has a count of 2
orange has a count of 1
pear has a count of 1
Related
I am trying to load a dictionary with around 11,000 pairs into a Swift program (about .7MB). The values of the dictionary are arrays, and I need to be able to loop through the dictionary's arrays and compare those values with values in another array. There is an average of 10 items in the subarrays, although some have two or three, and some have hundreds.
The program crashes, and I am looking for a way to fix it while keeping the functionality from my Python-based prototype. I was thinking of using a bunch of property lists, and loading them into memory one by one, but this seems inefficient. What can I do to redesign this?
Some of the code which is being problematic:var dictOfHashesAndMaterials:[String:[Int]]: = ["10928123sdfb234w3fw3": [123,435,1,573,456,3234,653,57856,434,345],"13435fw323df0239dfsdf":[435,756,978,231,5668,485,9678,1233,87,989]] // and 11,000 additional entries which look similar to this.
And here's where I use it:` // function to populate the list of ingredients.
func populateListOfRecipes(){
var storedIngredients = UserDefaults.standard.object(forKey: "ingredients") as! [Int]
for(key, value) in dictOfHashesAndMaterials{
let listSet = Set(storedMaterials)
let findListSet = Set(value)
if findListSet.isSubset(of: listSet){
print("found a match!")
arrayOfAvailableProjects.append(key)
}
}
tableView.reloadData()
}`
A tableView is then populated by cells which link to the specifics of a given project. The details are filled in when the hashes are sent to the server, and the corresponding instructions are returned in full to the app.
Is it crashing in the simulator? I've worked with 30 GB files using the simulator without a crash. On the device is a different story...
Anyway, try this:
for(key, value) in dictOfHashesAndMaterials{
autoreleasepool {
let listSet = Set(storedMaterials)
let findListSet = Set(value)
if findListSet.isSubset(of: listSet){
print("found a match!")
arrayOfAvailableProjects.append(key)
}
}
}
tableView.reloadData()
}
By draining the auto release pool every iteration you will prevent running out of available memory.
I've got some [NSDictionary] filled with books that I'm checking values for to better display the content in a UICollectionView.
I'm doing a check on a key if it contains several ISBN numbers. If it do I'll want to display them as separate books. The key is 'isbn' and if the value are something like '["9788252586336", " 9788203360510"]' I'll want to display them as separate books. So they'll become two "equal" books, but with different isbn numbers.
Here's what happens:
Heres the code
func parserFinishedSuccesfully(data: NSMutableArray){
self.tempArray = []
if !data.isEqual(nil) && data.count > 0
{
for i in (0 ..< data.count)
{
var currentBook = NSDictionary()
if !((data[i] as? NSDictionary)!.valueForKey("isbn") as? String)!.isEmpty
{
//If the book do have isbn then I'll want to display it
currentBook = data[i] as! NSDictionary
if let isbn = currentBook.valueForKey("isbn") as? String
{ //isbn could be "9788252586336; 9788203360510"
let isbnNumbersRecieved = isbnNumbers(isbn)
//gives: ["9788252586336", " 9788203360510"]
if isbnNumbersRecieved.count >= 2
{
//We are dealing with several isbn numbers on one book
for isbn in isbnNumbersRecieved
{ //The problem lies here!
print("This is the isbn: \(isbn)")
currentBook.setValue(isbn, forKey: "isbn")
self.tempArray.append(currentBook)
}
}
else
{
//Only one isbn
//currentBook.setValue(isbnNumbersRecieved.first, forKey: "isbn")
//self.tempArray.append(currentBook)
}
}else{
//No isbn, no want
print("No isbn")
}
}
}
print(self.tempArray)
self.collectionView.reloadData()
}
In code it says: "problem lies here". Well...the problem is there.
When it enters that loop with this array ["9788252586336", " 9788203360510"] it will first print the first number, then the second as it should.
Then when it takes currentBook. I change the value that is there before which would be "9788252586336; 9788203360510" with just the first isbn number "9788252586336". And then I take the entire currentBook and append it to the array (tempArray) that will be used to display the books in a UICollectionView later.
When the second iteration starts the isbn number will be "9788203360510" and I'll set the isbn number to be that in currentBook, just like the first iteration.
The weird thing happens now. When currentBook get's appended to (tempArray) it should contain the same book, but with different isbn numbers. But it actually contains the same fu#%$ book with same isbn number.
Somehow the second iteration adds 2 books with the same isbn number and the first book that should be there is gone...
Picture 1 is after the first iteration and picture 2 is after the second iteration. If you look at the key "isbn", you can see that there is something messed up with the code...
Picture 1
Picture 2
How can this happen and how would I fix it?
This is happening because currentBook is a reference type, not a value type. You are adding two references to the same book, currentBook, to the array.
As a simple fix, you could copy() the dictionary before modifying it and adding it to the array. A more robust fix would be to make all of your models value types instead.
Further reading:
Swift Blog: Value and Reference Types
Ray Wenderlich: Reference vs. Value Types in Swift
You are using the same dictionary object for each item in your for loop and hence this result. Try below and let me know if it works.
//We are dealing with several isbn numbers on one book
for isbn in isbnNumbersRecieved
{ //The problem lies here!
print("This is the isbn: \(isbn)")
let tempDict = NSDictionary()
tempDict.setValue(isbn, forKey: "isbn")
//currentBook.setValue(isbn, forKey: "isbn")
self.tempArray.append(tempDict)
}
Right now, my app retrieves around 1000 CKRecords which gets put into an array. I need to loop through this array later on. When I just had about 100 records to loop through my app worked smoothly and fast, but now it gets slower and slower as the number of records grow. I plan to have around 3000 CKRecords to retrieve so this could be a serious problem for me.
This is what I have come up with for now:
//movies == The NSMutableArray
if let recordMovies = movies.objectEnumerator().allObjects as? [CKRecord] {
for movie in recordMovies {
//Looping
print(movie.objectForKey("Title") as? String)
{
}
But this doesn´t help a lot, seeing as I have to turn the NSMutableArray into an Array of CKRecords. I do this because I have to access the CKRecords attributes when looping, such as movieTitle. What measures can I take to speed up the looping process
Iterating through 1000 (or 5000) elements of an array should take a tiny fraction of a second.
Printing to the debug console is very slow however. Get rid if your print statement.
If movies is an array of CKRecord objects, why can't you just cast it to the desired array type?
let start = NSDate.timeIntervalSinceReferenceDate()
if let recordMovies = movies as? [CKRecord] {
for movie in recordMovies {
//Do whatever work you need to do on each object, but DON'T use print
{
}
let elapsed = NSDate.timeIntervalSinceReferenceDate() - start
print( "iterating through the loop took \(elapsed) seconds")
You can save a lot of time by making "movies" not just an NSMutableArray, but an NSMutableArray of CKRecords, which makes it unnecessary to convert the array from NSMutableArray to a Swift array of CKRecord.
In Objective-C: Instead of
NSMutableArray* movies;
you use
NSMutableArray <CKRecord*> *movies;
As a result, movies [1] for example will have type CKRecord*, not id. An assignment movies [1] = something; requires that "something" is of type CKRecord* or id, nothing else.
I'm trying to update the user's music listen history by checking to see if any MPMediaItemPropertyPlayCounts have changed since the last check. Right now, upon each check, I'm querying the entire iPod library and comparing their existing play counts already in Core Data. If current play count != play count stored in Core Data, we do something with that song, since we know the user has listened to it recently.
In my code, I'm struggling to loop through all iPod songs and search for corresponding Core Data objects simultaneously. The code below prints out way too many lines. How do I search for objects in Core Data in a for loop?
class func checkiPodSongsForUpdate() {
var appDel: AppDelegate = UIApplication.sharedApplication().delegate as! AppDelegate
var context: NSManagedObjectContext = appDel.managedObjectContext!
var newSong = NSEntityDescription.insertNewObjectForEntityForName("IPodSongs", inManagedObjectContext: context) as! NSManagedObject
var request = NSFetchRequest(entityName: "IPodSongs")
request.returnsObjectsAsFaults = false
var results = context.executeFetchRequest(request, error: nil)
let query = MPMediaQuery.songsQuery()
let queryResult = query.collections as! [MPMediaItemCollection]
for song in queryResult {
for song in song.items as! [MPMediaItem] {
request.predicate = NSPredicate(format: "title = %#", "\(song.valueForProperty(MPMediaItemPropertyTitle))")
for result: AnyObject in results! {
if let playCount = result.valueForKey("playCount") as? String {
if playCount != "\(song.valueForProperty(MPMediaItemPropertyPlayCount))" {
println("\(song.valueForProperty(MPMediaItemPropertyTitle)) has a new play count.")
} else {
println("User hasn't listened to this song since last check.")
}
}
}
}
}
}
The immediate problem is that you assign a value to request.predicate, but doing so has no effect. A fetch request predicate only has effect if you assign it before doing the fetch. It becomes part of the fetch, and the fetch results only include objects that match the predicate.
There are a couple of possibilities for fixing this specific problem:
Do your Core Data fetch in the loop after assigning the predicate, inside the loop. This would get the results you expect. However it will also be extremely inefficient and won't scale well. Fetch requests are relatively expensive operations, so you'll probably find that you're spending a lot of time doing them.
Instead of assigning a value to request.predicate, filter the results array in memory by doing something like
let predicate = NSPredicate(format: "title = %#", "\(song.valueForProperty(MPMediaItemPropertyTitle))")
let currentResults = results.filteredArrayUsingPredicate(predicate)
That will get you an array of only the songs matching the predicate. This will be faster than option #1, but it still leaves the major problem that you're fetching all of the songs here. This potentially means loading a ton of managed objects into memory every time you call this function (suppose the user has, say, 50,000 songs?). It's still not a good solution. Unfortunately I don't know enough about MPMediaQuery to recommend a better approach.
Either of these should fix the immediate problem, but you need to address the more significant conceptual issue. You're comparing two potentially large collections looking for a single match. That's going to be a lot of CPU time or memory or both no matter how you approach it.
Found my own solution. Looping through two arrays can get expensive with enough objects in each array, just as #Tom said. I decided to loop through arrayOne and inside that loop, call a custom function that looks for the given key/value pair in arrayTwo.
// Custom function that returns Bool if arrayTwo contains a given key/value pair from inside the arrayOne loop
func arrayContains(array:[[String:String]], #key: String, #value: String) -> Bool {
for song in array {
if song[key] == value {
return true
}
}
return false
}
Big thanks to #vacawama for the help here: Find key-value pair in an array of dictionaries
I have an array of CKRecords that I would like to sort into a three dimensional array. Within the first array is an array of dates, and each date is an array of names, where a name is an Int between 0 and 4. I'm successfully sorting my records into a two dimensional array currently (code below).
Name can be retrieved with record.objectForKey("Name") as Int
func buildIndex(records: [CKRecord]) -> [[CKRecord]] {
var dates = [NSDate]()
var result = [[[CKRecord]]]()
for record in records {
var date = record.objectForKey("startTime") as NSDate
if !contains(dates, date) {
dates.append(date)
}
}
for date in dates {
var recordForDate = [CKRecord]()
for (index, exercise) in enumerate(records) {
let created = exercise.objectForKey("startTime") as NSDate
if date == created {
let record = records[index] as CKRecord
recordForDate.append(record)
}
}
result.append(recordForDate)
}
return result
}
Not sure the best way to approach this problem. Even general guidance would be appreciated.
General Overview:
Step 1 - choose your sort algorithm. I find that the insertion sort algorithm is the easiest for me to understand and is fast too.
Step 2 - decide on your data structure. You could use a 2-dimensional array. The first dimension represents your dates, and the second dimension represents your records. So the array might be defined like this List<List<CKRecord>>. So the first entry would contain a list (List<CKRecord>) of all the records with the earliest date (it may be one or many).
Basic Steps
(with a 2-D array)
So start with the empty data structure
Figure out which Date list it should go into
If the date does not exist yet, you need to sort the date into the correct position and add a new array/list with the new entry as the only contents
If the date already exists, you need to sort the record into the correct position of the already existing list of records
Enjoy