Effective use of Core Data and NSManagedObjectContext - ios

I was wondering, and never found any documentation to it, as to, first of all, what is an efficient way of fetching CD objects. In my app I have to store thousands of objects in my NSMOC and need to fetch them as fast as possible. Ois usi g NSPredicate the best way to fetch the entities that fit what I want the best way?
Secondly, if I want to keep an "archive" type of thing where I store thousands of objects not used often, but I need to have them, is it possible to have two NSMOCs, and if so, will it speed up the process of fetching entities?
Thanks

If you want to fetch CoreData objects efficiently, you have to consider a few approaches:
If you fetch a lot of data, you should do it on a background thread so that you don't block the UI.
NSFetchRequest has a property called propertiesToFetch which you can use to fetch only the properties you need
You can prefetch some of the relationships for later use using the relationshipKeyPathsForPrefetching property of NSFetchRequest
You should set a predicate on your NSFetchRequest object so that you fetch only the data you need
It is also important to structure your entity graph correctly so that you don't fetch unnecessary data when you don't need it.
You can't say that NSPredicate is the best way to fetch CoreData objects because you use NSFetchRequest for this task, but it is definitely is a good idea to use NSPredicate as well.
I don't think that having a separate MOC used as an "archive" is going to help your performance. You should probably read Apple documentation on CoreData performance

Related

Core Data sectionNameKeyPath with Relationship Attribute Performance Issue

I have a Core Data Model with three entities:
Person, Group, Photo with relationships between them as follows:
Person <<-----------> Group (one to many relationship)
Person <-------------> Photo (one to one)
When I perform a fetch using the NSFetchedResultsController in a UITableView, I want to group in sections the Person objects using the Group's entity name attribute.
For that, I use sectionNameKeyPath:#"group.name".
The problem is that when I'm using the attribute from the Group relationship, the NSFetchedResultsController fetches everything upfront in small batches of 20 (I have setFetchBatchSize: 20) instead of fetching batches while I'm scrolling the tableView.
If I use an attribute from the Person entity (like sectionNameKeyPath:#"name") to create sections everything works OK: the NSFetchResultsController loads small batches of 20 objects as I scroll.
The code I use to instantiate the NSFetchedResultsController:
- (NSFetchedResultsController *)fetchedResultsController {
if (_fetchedResultsController) {
return _fetchedResultsController;
}
NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
NSEntityDescription *entity = [NSEntityDescription entityForName:[Person description]
inManagedObjectContext:self.managedObjectContext];
[fetchRequest setEntity:entity];
// Specify how the fetched objects should be sorted
NSSortDescriptor *groupSortDescriptor = [[NSSortDescriptor alloc] initWithKey:#"group.name"
ascending:YES];
NSSortDescriptor *personSortDescriptor = [[NSSortDescriptor alloc] initWithKey:#"birthName"
ascending:YES
selector:#selector(localizedStandardCompare:)];
[fetchRequest setSortDescriptors:[NSArray arrayWithObjects:groupSortDescriptor, personSortDescriptor, nil]];
[fetchRequest setRelationshipKeyPathsForPrefetching:#[#"group", #"photo"]];
[fetchRequest setFetchBatchSize:20];
NSError *error = nil;
NSArray *fetchedObjects = [self.managedObjectContext executeFetchRequest:fetchRequest error:&error];
if (fetchedObjects == nil) {
NSLog(#"Error Fetching: %#", error);
}
_fetchedResultsController = [[NSFetchedResultsController alloc] initWithFetchRequest:fetchRequest
managedObjectContext:self.managedObjectContext sectionNameKeyPath:#"group.name" cacheName:#"masterCache"];
_fetchedResultsController.delegate = self;
return _fetchedResultsController;
}
This is what I get in Instruments if I create sections based on "group.name" without any interaction with the App's UI:
And this is what I get (with a bit of scrolling on UITableView) if sectionNameKeyPath is nil:
Please, can anyone help me out on this issue?
EDIT 1:
It seems that I get inconsistent results from the simulator and Instruments: when I've asked this question, the app was starting in the simulator in about 10 seconds (by Time Profiler) using the above code.
But today, using the same code as above, the app starts in the simulator in 900ms even if it makes a temporary upfront fetch for all the objects and it's not blocking the UI.
I've attached some fresh screenshots:
EDIT 2:
I reset the simulator and the results are intriguing: after performing an import operation and quitting the app the first run looked like this:
After a bit of scrolling:
Now this is what happens on a second run:
After the fifth run:
EDIT 3:
Running the app the seventh time and eight time, I get this:
This is your stated objective: "I need the Person objects to be grouped in sections by the relationship entity Group, name attribute and the NSFetchResultsController to perform fetches in small batches as I scroll and not upfront as it is doing now."
The answer is a little complicated, primarily because of how an NSFetchedResultsController builds sections, and how that affects the fetching behavior.
TL;DR; To change this behavior, you would need to change how NSFetchedResultsController builds sections.
What is happening?
When an NSFetchedResultsController is given a fetch request with pagination (fetchLimit and/or fetchBatchSize), several things happen.
If no sectionNameKeyPath is specified, it does exactly what you expect. The fetch returns an proxy array of results, with "real" objects for the first fetchBathSize number of items. So for example if you have setFetchBatchSize to 2, and your predicate matches 10 items in the store, the results contain the first two objects. The other objects will be fetched separately as they are accessed. This provides a smooth paginated response experience.
However, when a sectionNameKeyPath is specified, the fetched results controller has to do a bit more. To compute the sections it needs to access that key path on all the objects in the results. It enumerates the 10 items in the results in our example. The first two have already been fetched. The other 8 will be fetched during enumeration to get the key path value needed to build the section information. If you have a lot of results for your fetch request, this can be very inefficient. There are a number of public bugs concerning this functionality:
NSFetchedResultsController initially takes too long to set up sections
NSFetchedResultsController ignores fetchLimit property
NSFetchedResultsController, Table Index, and Batched Fetch Performance Issue
... And several others. When you think about it, this makes sense. To build the NSFetchedResultsSectionInfo objects requires the fetched results controller to see every value in the results for the sectionNameKeyPath, aggregate them to the unique union of values, and use that information to create the correct number of NSFetchedResultsSectionInfo objects, set the name and index title, know how many objects in the results a section contains, etc. To handle the general use case there is no way around this. With that in mind, your Instruments traces may make a lot more sense.
How can you change this?
You can attempt to build your own NSFetchedResultsController that provides an alternative strategy for building NSFetchedResultsSectionInfo objects, but you may run into some of the same problems. For example, if you are using the existing fetchedObjects functionality to access members of the fetch results, you will encounter the same behavior when accessing objects that are faults. Your implementation would need a strategy for dealing with this (it's doable, but very dependant on your needs and requirements).
Oh god no. What about some kind of temporary hack that just makes it perform a little better but doesn't fix the problem?
Altering your data model will not change the above behavior, but can change the performance impact slightly. Batch updates will not have any significant effect on this behavior, and in fact will not play nicely with a fetched results controller. It may be much more useful to you, however, to instead set the relationshipKeyPathsForPrefetching to include your "group" relationship, which may improve the fetching and faulting behavior significantly. Another strategy may be to perform another fetch to batch fault these objects before you attempt to use the fetched results controller, which will populate the various levels of Core Data in-memory caches in a more efficient manner.
The NSFetchedResultsController cache is primarily for section information. This prevents the sections from having to be completely recalculated on each change (in the best case), but can actually make the initial fetch to build the sections take much longer. You will have to experiment to see if the cache is worthwhile for your use case.
If your primary concern is that these Core Data operations are blocking user interaction, you can offload them from the main thread. NSFetchedResultsController can be used on a private queue (background) context, which will prevent Core Data operations from blocking the UI.
Based on my experience a way to achieve your goal is to denormalize your model. In particular, you could add a group attribute in your Person entity and use that attribute as sectionNameKeyPath. So, when you create a Person you should also pass the group it belongs to.
This denormalization process is correct since it allows you to avoid fetching of related Group objects since not necessary. A cons could be that if you change the name of a group, all the persons associated with that name must change, on the contrary you can have incorrect values.
The key aspect here is the following. You need to have in mind that Core Data is not a
relational database. The model should not designed as a database schema, where normalization could take place, but it should be designed from the perspective of how the data are presented and used in the user interface.
Edit 1
I cannot understand your comment, could you explain better?
What I've found very intriguing though is that even if the app is
performing a full upfront fetch in the simulator, the app loads in
900ms (with 5000 objects) on the device despite the simulator where it
loads much slower.
Anyway, I would be interested in knowing details about your Photo entity. If you pre-fetch photo the overall execution could be influenced.
Do you need to pre-fetch a Photo within your table view? Are they thumbs (small photos)? Or normal images? Do you take advantage of External Storage Flag?
Adding an additional attribute (say group) to the Person entity could not be a problem. Updating the value of that attribute when the name of a Group object changes it's not a problem if you perform it in background. In addition, starting from iOS 8 you have available a batch update as described in Core Data Batch Updates.
After almost a year since I've posted this question, I've finally found the culprits that enable this behaviour (which slightly changed in Xcode 6):
Regarding the inconsistent fetch times: I was using a cache and at the time I was back and forth with opening, closing and resetting the simulator.
Regarding the fact that everything was fetched upfront in small batches without scrolling (in Xcode 6's Core Data Instruments that's not the case anymore - now it's one, big fetch which takes entire seconds):
It seems that setFetchBatchSize does not work correctly with parent/child contexts. The issue was reported back in 2012 and it seems that it's still there http://openradar.appspot.com/11235622.
To overcome this issue, I created another independent context with an NSMainQueueConcurrencyType and set its persistence coordinator to be the same that my other contexts are using.
More about issue #2 here: https://stackoverflow.com/a/11470560/1641848

Core Data working with subset of full data store...HOW?

I'm banging my head trying to figure out something in Core Data that I think should be simple to do, and I need some help.
I have a data store which contains data from the past two years, but in my app, I have certain criteria so that the user only works with a subset of that data (i.e. just the past month). I have created the predicates to generate the fetch request and all that works fine.
My issue is that I then want to run some additional predicates on this subset of data (i.e. I just want objects with name=Sally). I'd like to do so without having to re-run the original predicate with an additional predicate (in a NSCompoundPredicate); I'd rather just run it on the subset of data already created.
Can I just run a predicate on the fetch results?
Is the format of the predicate the same as for the initial calls into the core data store?
Thanks for any help.
You could filter the original array of results using a predicate. See the NSArray filteredArrayUsingPredicate method.
The only reason I would consider doing what you're talking about is if you find that modifying the predicate at the FetchedResultsController level caused significant performance degradation.
My recommendation is to get a reference to your fetchedResultsController in your view controller and update your predicate with a compound predicate matching your additional search parameters.
If you were to tie your viewControllers data source to a predicate of a predicate you wouldn't be able to properly utilize the NSFetchedResultsControllers Delegate methods that allow for easy, dynamic updating of Views such as table view and collection view.
/* assuming you're abstracting your datastore with a singleton */
self.fetchedResultsController = [DataStore sharedStore].fetchedResultsController;
self.fetchedResultsController.delegate = self;
[self.fetchedResultsController.fetchRequest setPredicate:yourPredicate];
Make sure to configure your fetch requests' batch size to a reasonable value to improve performance.

Correct way to accessing attributes of a class modeled in core data

Let's say a Recipe object has an NSSet of one or more Ingredients, and that the same relationship is modeled in core data.
Given a recipe, what id the correct way to access its ingredients?
In this example it seems natural to use recipe.ingredients, but I could equally use an NSFetchRequest for Ingredient entities with an NSPredicate to match by recipe.
Now let's say I want only the ingredients that are 'collected'. This is less clear cut to me - should I use a fetch request for ingredients with a predicate restricting by recipe and collected state? Or loop through recipe.ingredients?
At the other end of the scale, perhaps I need only ingredients from this recipe that also appear in other recipes. Now, the fetch request seems more appealing.
What is the correct general approach? Or is it a case by case scenario? I am interested in the impact on:
Consitancy
Readability
Performance
Robustness (for example, it is easy to make an error in a fetch request that the compiler cannot catch).
Let's go through these in order.
Getting the ingredients for a specific Recipe, when you already have a reference: Use recipe.ingredients every time.
Getting the ingredients for a specific Recipe that have a specific value (e.g. a Boolean flag value): Easiest is probably to start with recipe.ingredients as above and then use something like NSSet's objectsPassingTest to filter them. Most elegant is to set up a fetched property on Recipe that just returns these ingredients with no extra code (the syntax may not be immediately obvious, see a previous answer I wrote for details). These two probably perform about equally. Least appealing is a fetch request.
Getting ingredients that appear in multiple recipe instances: Probably a fetch request for the Ingredient entity where the predicate is something like recipe in %#, and the %# is replaced by a list of Recipe instances.
Some basic info:
*memory operations are ~100-1000 times faster then disk operations.
*A fetch request execution is always a trip to the store (disk), and so, degrade performance.
In your case, you have a "small" set of objects that need to be queried for information.
simply iterating over them using the recipe.ingredients set would fault them one by one, each access will be a trip to the store (fault resolution).
In this case, use prefetching (either in the request, set the setRelationshipKeyPathsForPrefetching: to prefetch the ingredients relationship or execute a fetch request that fetch the set with the appropriate predicate).
If you need specific data only, then use the fetch request approach to retrieve only the data you need.
if you intend to repeatedly access the relationship for queries and info, just fetch the entire set by using prefetching, and query in-memory.
My point is:
Think of the approach that minimize your disk access (in any case you need at least 1 access).
If your data is too large to fit in memory, or to be queried in memory, perform a fetch to get only the data you need.
Now:
1.Consistency - Pick a method you find comfortable and stick with it (i use prefetching)
2.Readability - Using a property is much more readable then executing a query, however it is less efficient if not using prefetching.
3.Performance - Disk access degrade performance, but is unavoidable in some situations
4.Robustness - A fetch request show that you know what is best for your data usage. use it wisely.
To make sure you are minimising disk access, turn SQLite debug on
(-com.apple.CoreData.SQLDebug)
Edit:
Faulting behaviour
In this example it seems natural to use recipe.ingredients, but I
could equally use an NSFetchRequest for Ingredient entities with an
NSPredicate to match by recipe.
Why would you do the latter when you can do the former? You already have the recipe, and it already has a set of ingredients, so there's no need to look at all the ingredients and filter out just those that are related to the recipe that you already have.
Now let's say I want only the ingredients that are 'collected'. This
is less clear cut to me - should I use a fetch request for ingredients
with a predicate restricting by recipe and collected state? Or loop
through recipe.ingredients?
Apply the predicate to the recipe's ingredients:
NSPredicate *isCollected = [NSPredicate predicateWithFormat:#"collected == YES"];
NSSet *collectedIngredients = [recipe.ingredients filteredSetUsingPredicate:isCollected];
At the other end of the scale, perhaps I need only ingredients from
this recipe that also appear in other recipes. Now, the fetch request
seems more appealing.
Again, using a fetch request here seems wasteful because you already have easy access to the set of ingredients that could be in the final result, and that's potentially a much smaller set than the set of all ingredients. Use the same approach as above, but change the predicate to test the recipes associated with each ingredient. Something like:
NSPredicate *p = [NSPredicate predicateWithFormat:#"recipes > 1"];
NSSet *i = [recipe.ingredients filteredSetUsingPredicate:p];
What is the correct general approach?
Fetch requests are a good way to search through all instances of a given entity. You're always going to start with a fetch request to get some objects to work with. But when the objects you want are somehow related to an object that you already have you can (and should) use those relationships to get what you want.

NSFetchRequest of simply filter the relation?

Assume a NSManagedObject A and one B. Now A has a to-many B relationship on bs.
Is [A.bs filter...] with some NSPredicate is very convenient, but likely slower then building up a NSFetchRequest on B with the same predicate and a condition to match the relation, or am I mistaken?
I guess this performance issue is even worse, if you do something like lastObject on the result to obtain only a single result. (The NSFetchRequest offers a fetchLimit property that can be exploited for this purpose).
On a similar note, if you are just interested in one or two properties, NSFetchRequest providers a propertiesToFetch property as well.
My reasoning behind this is, that using the relation directly requires core data to pull all NSManagedObjects into the relevant NSManagedObjectContext. While the NSFetchRequest can perform optimizations on the store-level.
Now:
is my reasoning correct? Thus is the takeaway, that if you are not interested in all relation objects, go with a NSFetchRequest?
is there a solution to have the convenient (and obviously more readable) approach via the relations having similar performance?
Yes your assumption is correct that performance wise fetch requests can use optimizations while the filtering approach is actually a two stage approach where first you fetch all your objects into a NSSet and then send -filteredSetUsingPredicate: to the NSSet which is out of core data's scope already.
One alternative would be to use fetch request templates in your model and instantiate them in your code with [managedObjectModel fetchRequestFromTemplateWithName:substitutionVariables:]
Your reasoning is correct.
I would add NSFetchRequests offer much more flexibility and opportunities to optimize the requests.
Fetch requests can uses indexes in the underlaying database.
Fetch requests can retrieve objects as faults instead of loading them in full.
Fetch requests can preload some chosen attributes.
They can even perform some operations for you (see NSExpressionDescription, NSExpression).
...
Even if you need to use some in-memory filtering (with objectsWithOptions:passingTest: per example), you better have to use a fetch request to preload the objects and attributes you need. Many fault resolutions will be slower in any case.

CoreData sorted entities - performance

I would like to have my CoreData entities stored in a sorted array/set, so that I don't have to sort it every time in a fetch using NSSortDescriptor. In iOS 4 and below, I believe this is my only option, but sorting the entire data set (not just the fetched results) every time - that sounds terribly inefficient, even for a relatively small data set of ~10k.
In iOS 5 there are sorted sets; I wonder if the performance gain is enough to warrant dropping iOS 4 support? Any experiences to be shared?
You don't need to sort the results - you can do your fetch and pass sortOrderings to the fetch. If you need to traverse a relationship, you can add a method of your own on the parent that does a raw fetch on the child, has a predicate added where the parent relationship is self, add your own sort orderings and returns a nice NSArray :)
I mean hey, if you can force iOS5, great, but if you need to keep iOS 4 around, letting the database do the sorting is very performant.

Resources