So I browsed Stack Overflow for the answer to my question, and everyone says that count requests are the way to go. I found that to be false when I ran unit tests on my app.
for _ in 0..<largeNumber { // largeNumber is 1000
    let count = try self.context.count(for: countRequest)
}
// operation took 0.2 seconds!
whereas
for _ in 0..<largeNumber { // largeNumber is 1000
    let fetch = try self.context.fetch(fetchRequest)
}
// operation took 0.158 seconds!
So what's everyone blabbering about with count requests being more efficient? If anything, it makes things worse. That said, is there a more efficient way of checking whether a value exists in Core Data?
The results of your tests may be due to a difference in what's being done in the two versions. Count returns only the number of matching managed objects, whereas fetch is likely populating the attributes, relationships, etc.
Core Data likely caches this information so that an identical fetch request doesn't have to be repeated. The result (in your example) could be the equivalent of one fetch request and 999 no-ops.
Count should be the most efficient, as it doesn't populate the managed objects. What happens if you loop 1000 times but use a different predicate on each pass?
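A minimal sketch of that experiment (the Employee entity and its name attribute are hypothetical stand-ins), with the common fetchLimit trick added for the asker's existence question:

import CoreData

// A minimal sketch, assuming a hypothetical "Employee" entity with a "name" attribute.
// A different predicate each pass means Core Data cannot reuse cached results.
func benchmarkCounts(in context: NSManagedObjectContext) throws {
    for i in 0..<1000 {
        let request = NSFetchRequest<NSManagedObject>(entityName: "Employee")
        request.predicate = NSPredicate(format: "name == %@", "Employee \(i)")
        _ = try context.count(for: request) // no managed objects are populated
    }
}

// For a pure existence check, capping the request at one row is a common pattern:
func employeeExists(named name: String, in context: NSManagedObjectContext) throws -> Bool {
    let request = NSFetchRequest<NSManagedObject>(entityName: "Employee")
    request.predicate = NSPredicate(format: "name == %@", name)
    request.fetchLimit = 1 // stop at the first match
    return try context.count(for: request) > 0
}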
My data model has a ClickerRecord entity with 2 attributes: date (NSDate) and numberOfBiscuits (NSNumber). Every time a new record is added, a different value for numberOfBiscuits can be entered.
To calculate a daily average for the number of biscuits I'm currently doing a fetch request for each day within range and using the corresponding NSExpression to calculate the sum of all numberOfBiscuits values for that day.
The problem: I'm using asynchronous fetch requests to avoid blocking the main thread, and since the fetch requests are performed one after another, it ends up being quite slow when there are many days between the first and last record.
I could also load all records into memory and perform the sorting and calculations, but I'm worried that it could become an issue when the number of records becomes very large.
Therefore, my question: Is it possible to use NSExpressions to add something like sub-predicates for each date interval, in order to do a single fetch request and retrieve a dictionary with an entry for each daily sum of numberOfBiscuits?
If not, what would be the recommended approach for this situation?
I've read about subqueries but as far as I've understood they're not intended for this kind of use.
This is the first question I'm asking on SO, so I hope to have written it in a clear way :)
I think what you are looking for is propertiesToGroupBy (see the Apple docs) on NSFetchRequest, though in your case it is not straightforward to implement, for reasons I will discuss below.
Suppose you could specify the category of biscuit consumed on each occasion, stored in a category attribute of your entity. Then to obtain the total number of biscuits of each category (ignoring the date), you could use an NSExpression with the sum: function and specify:
fetch.propertiesToGroupBy = ["category"]
Core Data will then group the results of the fetch by category and calculate the sum for each group separately.
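Putting those pieces together, a minimal sketch (ClickerRecord and numberOfBiscuits are from the question; category is the hypothetical attribute from this example):

import CoreData

// A minimal sketch: group by the hypothetical "category" attribute and sum
// "numberOfBiscuits" within each group, returning one dictionary per category.
func biscuitTotalsByCategory(in context: NSManagedObjectContext) throws -> [NSDictionary] {
    let sum = NSExpressionDescription()
    sum.name = "totalBiscuits"
    sum.expression = NSExpression(forFunction: "sum:",
                                  arguments: [NSExpression(forKeyPath: "numberOfBiscuits")])
    sum.expressionResultType = .integer64AttributeType

    let fetch = NSFetchRequest<NSDictionary>(entityName: "ClickerRecord")
    fetch.resultType = .dictionaryResultType // grouped fetches must return dictionaries
    fetch.propertiesToGroupBy = ["category"]
    fetch.propertiesToFetch = ["category", sum] // only the group key and aggregates
    return try context.fetch(fetch)
}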
The problem in your case is that (unless you already strip out the time information from your date attribute) there is no attribute that represents the date interval you want to group by, and Core Data will not let you specify a computed value to group by. You would need to add a new day attribute to your entity, calculate it whenever you add or update a record, and specify it in the group-by. And you face the same problem again if you subsequently want to calculate your average over a different interval, weeks or months for example.

One other downside is that the results will only include days for which there are ClickerRecords: if the user has a day where they consume no biscuits, then the fetch will not show a result for that day (i.e. it will not infer an average of 0). You would need to handle this appropriately when using the results.
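Deriving that day value is cheap at save time; a sketch, assuming the hypothetical day attribute just described:

import Foundation

// Sketch for the hypothetical "day" attribute: store the start of the day
// alongside each record whenever it is added or updated.
func dayBucket(for date: Date) -> Date {
    Calendar.current.startOfDay(for: date)
}

You would then group on ["day"] instead of ["category"] in the fetch above.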
It might be better either to tune your asynchronous fetch or, as you suggest, just to read the whole lot into memory to perform the calculations. If your entity only has those two attributes, and assuming your users don't live entirely on biscuits, the volumes should not be too problematic.
I need to count the number of objects in a Core Data collection that satisfy certain criteria
(e.g. count the number of employees in each distinct department).
There are two solutions to my problem:
(1) Fetch the collection in a single request and filter the array locally for each department using NSPredicate
(2) Execute multiple NSFetchRequests directly on the data
The question is which solution will be fastest and use the least memory, given that this is only for instrumentation purposes and has no bearing on the app's behavior/UI.
Counter question: if it is (1), which is the best way to filter the array - manual looping and counting, or NSPredicate?
P.S.:
a. The names of the departments are known to me (it's actually an enum).
b. The collection is small - 50 at most.
(1) is fastest and takes the most memory.
(2) will use the least memory but may take longer.
However, this is not always true. If your individual fetch requests would return many of the same employee records as other fetch requests do, it may even be the other way around. But as you are fetching by department, the results are disjoint, so that will not be the case.
For a small collection it may not make much of a difference anyway.
Counter question: this, too, depends. However, I'd go for the predicate, as that is safe for future use if the collection grows.
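For illustration, a minimal sketch of option (1), assuming a hypothetical Employee entity with a department attribute and made-up department names:

import CoreData

// A minimal sketch: one fetch for the whole (small) collection, then an
// in-memory NSPredicate filter per known department.
func countEmployeesPerDepartment(in context: NSManagedObjectContext) throws -> [String: Int] {
    let request = NSFetchRequest<NSManagedObject>(entityName: "Employee")
    let employees = try context.fetch(request)

    let departments = ["Sales", "Engineering", "Support"] // hypothetical enum values
    var counts: [String: Int] = [:]
    for department in departments {
        let predicate = NSPredicate(format: "department == %@", department)
        counts[department] = (employees as NSArray).filtered(using: predicate).count
    }
    return counts
}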
I understand that with Parse there is a PFQuery limit where you can only retrieve 1000 objects at a time. I presume it doesn't, but does this also limit the number of whereKey comparisons that can be carried out? E.g.:
let query = PFQuery(className: "Photos")
query.whereKey("Name", equalTo: someString)
query.findObjectsInBackgroundWithBlock { objects, error in
    // handle the matching objects here
}
If there are more than 1000 objects in the class, will the whereKey comparison stop after it has compared 1000 objects, or is the limit only on actually retrieving more than 1000 objects?
The reason I presume there is no limit on the comparisons is that, if you have more than 1000 users, there would otherwise be no straightforward way to do a standard user query.
Using whereKey parameters does not affect your fetch limit; in fact, it reduces the number of objects fetched, which is exactly its purpose. The point of including keys is to narrow things down, correct? You can even include multiple keys or whereKey statements in the same query, and by narrowing the query further you reduce the number of objects that will be fetched. So, in short, your presumption is correct.
Let's be clear first: whereKey isn't actually doing anything on its own; it sets a filter [parameter] that is applied to your asynchronous call, so the given block has those constraints to work with. findObjects is what is subject to the limit you now know is 1000. You can skip results (See Here), which effectively means you can query for the first 1000, then skip the ones you've already retrieved once you're ready to display further results [pagination]. So, to answer your second question: whereKey won't stop comparing, because it isn't really doing the work anyway, nor will you stop being able to retrieve objects; you just have to learn how to navigate around your first 1000 returned objects.
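A minimal sketch of that pagination, reusing the hypothetical Photos query from the question:

// A minimal sketch: fetch the second page of 1000 by skipping the first.
let pageSize = 1000
let query = PFQuery(className: "Photos")
query.whereKey("Name", equalTo: someString)
query.limit = pageSize
query.skip = pageSize // skip the 1000 objects already retrieved
query.findObjectsInBackgroundWithBlock { objects, error in
    // handle the second page of results here
}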
There are numerous ways of querying users; it all depends on your app's direction and current setup. You have to think about Parse as a business and not a service: they make money off API requests, so the more you make, the better for them. I would suggest coming back to SO once you actually run into an issue with this so someone can help you out, if you need it.
I'm performing a query on a SQLite db where I pull out a quite large data set of call records. On the same page I want to show the breakdown of counts per day for those call records, so I perform about 30 count queries against the database.
Is there a way I can filter the set that I retrieve initially and perform the counts on the in-memory set, so I don't have to run those repeated queries? I need those counts for graphing and display purposes, but even with an index on date, it takes about 10 seconds to run the initial query plus all of the count queries.
What I'm basically asking is: is there a way to perform the counts on the records already returned, or some analysis on them, or is there a smarter way to cache this data?
@set = Record.get_records_for_range(date1, date2)
while date1 < date2
  @count = Record.count_records_for_date(date1)
  date1 = date1 + 1
end
is basically what I'm doing. Surely there's a simpler and faster way?
Using @set.length will get you the count of the in-memory set without querying the database, because it is performed by Ruby, not ActiveRecord (as .count is).
Read about it here https://batsov.com/articles/2014/02/17/the-elements-of-style-in-ruby-number-13-length-vs-size-vs-count/
Here is a quote pulled out of that article
length is a method that’s not part of Enumerable - it’s part of a concrete class (like String or Array) and it’s usually running in O(1) (constant) time. That’s as fast as it gets, which means that using it is probably a good idea.
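Building on that, a minimal sketch in the question's own terms (assuming each record exposes a date attribute) that replaces the ~30 count queries with one query plus in-memory grouping:

# A minimal sketch, assuming each record responds to `date` as in the question.
@set = Record.get_records_for_range(date1, date2)

# One pass over the in-memory set: bucket by calendar day, count each bucket.
@counts = @set.group_by { |record| record.date.to_date }
              .transform_values(&:length)

# @counts[some_date] => number of call records on that day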
Is there a maximum number of rows that should be in a table to perform a sort operation? I've got a table with a last-modified date on each entry. If I'd like to get, say, the 50 most recently modified entries, I first sort by the last-modified date and then fetch 50 entries. The fetch should be no problem, but is there a size beyond which the sorting could become slow (e.g. > 1 second)?
Thanks a lot,
Stefan
If you declare the sort descriptor as usual and set your fetchLimit to the desired value, Core Data will take care of all the necessary optimizations.
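For example, a minimal sketch, with Entry and lastModified as hypothetical stand-ins for your entity and attribute:

import CoreData

// A minimal sketch: the sort happens in the SQLite store and only 50 rows come back.
func fiftyMostRecentEntries(in context: NSManagedObjectContext) throws -> [NSManagedObject] {
    let request = NSFetchRequest<NSManagedObject>(entityName: "Entry")
    request.sortDescriptors = [NSSortDescriptor(key: "lastModified", ascending: false)]
    request.fetchLimit = 50
    return try context.fetch(request)
}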
In my experience, you would need at least tens of thousands of records before the sort takes more than 1 second. Clearly, this depends on hardware and other factors, but you can test it if you want to be sure.