Core Data: Parent context and change propagation - ios

I have the following Core Data setup in my app:
Persistent Store Coordinator
^ Background MOC (NSPrivateQueueConcurrencyType)
^ Main Queue MOC (NSMainQueueConcurrencyType)
Here is the initialization code:
_backgroundContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
[_backgroundContext setPersistentStoreCoordinator:self.coordinator];
_mainContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
[_mainContext setParentContext:_backgroundContext];
I use the background MOC for importing large amounts of data. I also use it to perform complex fetch requests in the background and then pass the object IDs to the main queue to fetch the objects using these IDs.
This works quite well. However, I am not sure how to let the main queue MOC know about the changes made in the background MOC. I know that if I perform a fetch request on the main queue MOC, it will get the changes, but that's not what I want.
Is it OK to use the NSManagedObjectContextObjectsDidChangeNotification notification posted by the background MOC and call mergeChangesFromContextDidSaveNotification: on the main queue MOC? This should then cause the NSManagedObjectContextObjectsDidChangeNotification notification of the main queue MOC to fire. I am listening for this notification in my view controllers and examine the userInfo for changes and redisplay data accordingly.
I think this is the usual approach when you have one persistent store coordinator with two attached MOCs, but I am not sure whether it is the right approach when you have child/parent contexts.

Having the main MOC use a private parent MOC for asynchronous I/O is fine. However, you should not use that parent MOC for anything but performing background work on behalf of the main MOC. There are many reasons for this (among them performance and nasty issues related to transient object IDs).
If you want to do background updating of the store, here is what I suggest.
PSC <--+-- PrivateMOC <---- MainMOC
       |
       +-- BackgroundPrivateMOC
This will allow background operation that causes the least interruption to the main MOC, while allowing the PSC caches to be shared.
Now, for sharing data...
The MainMOC should listen for and merge DidSave notifications from the BackgroundPrivateMOC.
The BackgroundPrivateMOC can listen for and merge DidSave notifications from the PrivateMOC.
This allows merging to use only permanent object IDs and optimizes performance.
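A minimal sketch of the arrangement above, assuming an already-configured coordinator. The StackSketch class and its property names are hypothetical and not part of the original answer. It wires only the first merge direction (background saves merged into the main context); the reverse direction would follow the same pattern.

```objc
#import <CoreData/CoreData.h>

// Hypothetical holder for the three contexts in the diagram above.
@interface StackSketch : NSObject
@property (nonatomic, strong) NSManagedObjectContext *privateContext;    // PrivateMOC: parent of main, does the disk I/O
@property (nonatomic, strong) NSManagedObjectContext *mainContext;       // MainMOC: used by the UI
@property (nonatomic, strong) NSManagedObjectContext *backgroundContext; // BackgroundPrivateMOC: background updates
- (instancetype)initWithCoordinator:(NSPersistentStoreCoordinator *)coordinator;
@end

@implementation StackSketch {
    id _observer; // token returned by the notification center
}

- (instancetype)initWithCoordinator:(NSPersistentStoreCoordinator *)coordinator {
    if ((self = [super init])) {
        _privateContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
        _privateContext.persistentStoreCoordinator = coordinator;

        _mainContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
        _mainContext.parentContext = _privateContext;

        // Attached directly to the coordinator, not to either of the other contexts.
        _backgroundContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
        _backgroundContext.persistentStoreCoordinator = coordinator;

        // Merge saves made by the background context into the main context,
        // on the main context's own queue.
        NSManagedObjectContext *main = _mainContext;
        _observer = [[NSNotificationCenter defaultCenter]
            addObserverForName:NSManagedObjectContextDidSaveNotification
                        object:_backgroundContext
                         queue:nil
                    usingBlock:^(NSNotification *note) {
            [main performBlock:^{
                [main mergeChangesFromContextDidSaveNotification:note];
            }];
        }];
    }
    return self;
}

- (void)dealloc {
    [[NSNotificationCenter defaultCenter] removeObserver:_observer];
}
@end
```

Because the observer is registered for the background context only, saves of the main or private context never loop back into the merge.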

I'd say that listening for NSManagedObjectContextObjectsDidChangeNotification is probably not the best solution.
Here is the way I do it, and it works.
Here is main context creation:
_mainContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
_mainContext.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy;
_mainContext.persistentStoreCoordinator = _persistentStoreCoordinator;
Here is background context creation:
_backgroundContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
_backgroundContext.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy;
_backgroundContext.parentContext = self.mainContext;
Now, the background context is only for writing (or reading) objects, possibly on a background thread. The main context is only for reading on the main queue.
Save on background context should look like:
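// `error` is assumed to be the NSError ** parameter of the enclosing save method.
__block BOOL saved = [_backgroundContext save:error];
if (saved && _backgroundContext.parentContext) {
    // Push the changes one level further, on the parent's own queue.
    [_backgroundContext.parentContext performBlockAndWait:^{
        saved = [_backgroundContext.parentContext save:error];
    }];
}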
This save method guarantees that all changes are propagated to the main context. If you do a lot of work on many background threads, get familiar with the performBlockAndWait: method, which serializes access to the context.
If you want to be notified about object changes, you don't have to listen for notifications; you can simply set up an NSFetchedResultsController and register as its delegate.

Related

Core Data background context best practice

I have a large import task I need to do with core data.
Let's say my Core Data model looks like this:
Car
----
identifier
type
I fetch a list of car info JSON from my server and then I want to sync it with my core data Car object, meaning:
If it's a new car -> create a new Core Data Car object from the new info.
If the car already exists -> update the Core Data Car object.
So I want to do this import in the background without blocking the UI while the user scrolls a cars table view that presents all the cars.
Currently I'm doing something like this:
// create background context
NSManagedObjectContext *bgContext = [[NSManagedObjectContext alloc]initWithConcurrencyType:NSPrivateQueueConcurrencyType];
[bgContext setParentContext:self.mainContext];
[bgContext performBlock:^{
NSArray *newCarsInfo = [self fetchNewCarInfoFromServer];
// import the new data to Core Data...
// I'm trying to do an efficient import here,
// with few fetches as I can, and in batches
NSError *error = nil;
for (... num of batches ...) {
// do batch import...
// save bg context in the end of each batch
[bgContext save:&error];
}
// when all import batches are over I call save on the main context
// save
[self.mainContext save:&error];
}];
But I'm not really sure I'm doing the right thing here, for example:
Is it ok that I use setParentContext ?
I saw some examples that use it like this, but I saw other examples that don't call setParentContext, instead they do something like this:
NSManagedObjectContext *bgContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
bgContext.persistentStoreCoordinator = self.mainContext.persistentStoreCoordinator;
bgContext.undoManager = nil;
Another thing I'm not sure about is when to call save on the main context. In my example I just call save at the end of the import, but I saw examples that use:
[[NSNotificationCenter defaultCenter] addObserverForName:NSManagedObjectContextDidSaveNotification object:nil queue:nil usingBlock:^(NSNotification* note) {
NSManagedObjectContext *moc = self.managedObjectContext;
if (note.object != moc) {
[moc performBlock:^(){
[moc mergeChangesFromContextDidSaveNotification:note];
}];
}
}];
As I mentioned before, I want the user to be able to interact with the data while it is updating. So what if the user changes a car type while the import changes the same car? Is the way I wrote it safe?
UPDATE:
Thanks to @TheBasicMind's great explanation, I'm trying to implement option A, so my code looks something like this:
This is the Core Data configuration in AppDelegate:
AppDelegate.m
#pragma mark - Core Data stack
- (void)saveContext {
NSError *error = nil;
NSManagedObjectContext *managedObjectContext = self.managedObjectContext;
if (managedObjectContext != nil) {
if ([managedObjectContext hasChanges] && ![managedObjectContext save:&error]) {
DDLogError(@"Unresolved error %@, %@", error, [error userInfo]);
abort();
}
}
}
// main
- (NSManagedObjectContext *)managedObjectContext {
if (_managedObjectContext != nil) {
return _managedObjectContext;
}
_managedObjectContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
_managedObjectContext.parentContext = [self saveManagedObjectContext];
return _managedObjectContext;
}
// save context, parent of main context
- (NSManagedObjectContext *)saveManagedObjectContext {
if (_writerManagedObjectContext != nil) {
return _writerManagedObjectContext;
}
NSPersistentStoreCoordinator *coordinator = [self persistentStoreCoordinator];
if (coordinator != nil) {
_writerManagedObjectContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
[_writerManagedObjectContext setPersistentStoreCoordinator:coordinator];
}
return _writerManagedObjectContext;
}
And this is how my import method looks like now:
- (void)import {
NSManagedObjectContext *saveObjectContext = [AppDelegate saveManagedObjectContext];
// create background context
NSManagedObjectContext *bgContext = [[NSManagedObjectContext alloc]initWithConcurrencyType:NSPrivateQueueConcurrencyType];
bgContext.parentContext = saveObjectContext;
[bgContext performBlock:^{
NSArray *newCarsInfo = [self fetchNewCarInfoFromServer];
// import the new data to Core Data...
// I'm trying to do an efficient import here,
// with few fetches as I can, and in batches
NSError *error = nil;
for (... num of batches ...) {
// do batch import...
// save bg context in the end of each batch
[bgContext save:&error];
}
// no call here for main save...
// instead use NSManagedObjectContextDidSaveNotification to merge changes
}];
}
And I also have the following observer:
[[NSNotificationCenter defaultCenter] addObserverForName:NSManagedObjectContextDidSaveNotification object:nil queue:nil usingBlock:^(NSNotification* note) {
NSManagedObjectContext *mainContext = self.managedObjectContext;
NSManagedObjectContext *otherMoc = note.object;
if (otherMoc.persistentStoreCoordinator == mainContext.persistentStoreCoordinator) {
if (otherMoc != mainContext) {
[mainContext performBlock:^(){
[mainContext mergeChangesFromContextDidSaveNotification:note];
}];
}
}
}];
This is an extremely confusing topic for people approaching Core Data for the first time. I don't say this lightly, but with experience, I am confident in saying the Apple documentation is somewhat misleading on this matter (it is in fact consistent if you read it very carefully, but they don't adequately illustrate why merging data remains in many instances a better solution than relying on parent/child contexts and simply saving from a child to the parent).
The documentation gives the strong impression that parent/child contexts are the new preferred way to do background processing. However, Apple neglects to highlight some strong caveats. Firstly, be aware that everything you fetch into a child context is first pulled through its parent. It is therefore best to limit any child of the main context to processing (editing) data that has already been presented in the UI on the main thread. If you use it for general synchronisation tasks, you will likely want to process data that extends far beyond what you are currently displaying in the UI. Even if you use NSPrivateQueueConcurrencyType for the child edit context, you will potentially be dragging a large amount of data through the main context, and that can lead to bad performance and blocking.
Nor is it a good idea to make the main context a child of the context you use for synchronisation: it won't be notified of synchronisation updates unless you do that manually, and you will be executing potentially long-running tasks on a context that may need to respond to saves cascading from the edit context (a child of your main context), through the main context and down to the data store. You will have to merge the data manually and possibly also track what needs to be invalidated in the main context and re-synced. Not the easiest pattern.
What the Apple documentation does not make clear is that you are most likely to need a hybrid of the techniques described on the pages describing the "old" thread confinement way of doing things, and the new Parent-Child contexts way of doing things.
Your best bet is probably (and I'm giving a generic solution here; the best solution may depend on your detailed requirements) to have an NSPrivateQueueConcurrencyType save context as the topmost parent, which saves directly to the data store [Edit: you won't be doing very much directly on this context]. Then give that save context at least two direct children: one is the NSMainQueueConcurrencyType main context you use for the UI [Edit: it's best to be disciplined and avoid ever editing data on this context]; the other is an NSPrivateQueueConcurrencyType context you use for user edits of the data and also (in option A in the attached diagram) your synchronisation tasks.
Then you make the main context the target of the NSManagedObjectContextDidSaveNotification generated by the sync context, and pass the notification (whose userInfo dictionary describes the changes) to the main context's mergeChangesFromContextDidSaveNotification:.
The next question to consider is where you put the user edit context (the context where edits made by the user get reflected back into the interface). If the user's actions are always confined to edits on small amounts of presented data, then making this a child of the main context, again using NSPrivateQueueConcurrencyType, is your best bet and the easiest to manage. A save will then push the edits directly into the main context, and if you have an NSFetchedResultsController, the appropriate delegate method (controller:didChangeObject:atIndexPath:forChangeType:newIndexPath:) will be called automatically so your UI can process the updates. Again, this is option A.
If, on the other hand, user actions might result in large amounts of data being processed, you might want to consider making it another peer of the main context and the sync context, so that the save context has three direct children: main, sync (private queue type) and edit (private queue type). I've shown this arrangement as option B on the diagram.
Similarly to the sync context you will need to [Edit: configure the main context to receive notifications] when data is saved (or if you need more granularity, when data is updated) and take action to merge the data in (typically using mergeChangesFromContextDidSaveNotification:). Note that with this arrangement, there is no need for the main context to ever call the save: method.
To understand parent/child relationships, take option A. The parent/child approach simply means that if the edit context fetches NSManagedObjects, they will be "copied into" (registered with) first the save context, then the main context, and finally the edit context. You will be able to make changes to them; then, when you call save: on the edit context, the changes will be saved only to the main context. You have to call save: on the main context and then save: on the save context before they are written out to disk.
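The chain of saves described above can be sketched as a small helper. This is only an illustration under stated assumptions: SaveThroughParents is a hypothetical name, and the helper blocks the caller with performBlockAndWait:, so it should not be called from the main thread when a long disk write is expected, nor from a place where a parent's queue is itself waiting on the caller.

```objc
#import <CoreData/CoreData.h>

// Saves `context`, then each ancestor in turn, so the changes finally reach the store.
// Returns NO and fills `error` at the first save that fails.
static BOOL SaveThroughParents(NSManagedObjectContext *context, NSError **error) {
    NSManagedObjectContext *current = context;
    while (current != nil) {
        __block BOOL ok = YES;
        __block NSError *saveError = nil;
        __block NSManagedObjectContext *parent = nil;
        // Every context is touched only on its own queue.
        [current performBlockAndWait:^{
            if (current.hasChanges) {
                ok = [current save:&saveError];
            }
            parent = current.parentContext;
        }];
        if (!ok) {
            if (error != NULL) { *error = saveError; }
            return NO;
        }
        current = parent; // nil once the context attached to the coordinator was saved
    }
    return YES;
}
```

With option A this would be called with the edit context: the first save pushes the edits to the main context, the second to the save context, and the third writes them to disk.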
When you save from a child up to a parent, the various managed object context change and save notifications are fired. So, for example, if you are using a fetched results controller to manage the data for your UI, its delegate methods will be called so you can update the UI as appropriate.
Some consequences: if you fetch an NSManagedObject A on the edit context, modify it, and save, the modifications are pushed to the main context. You now have the modified object registered with both the main and the edit context. It would be bad style to do so, but you could now modify the object again on the main context, and it would then differ from the object as it is held in the edit context. If you then try to make further modifications to the object in the edit context, your modifications will be out of sync with the object on the main context, and any attempt to save the edit context will raise an error.
For this reason, with an arrangement like option A, it is a good pattern to fetch objects, modify them, save them and reset the edit context (e.g. [editContext reset]) within a single iteration of the run loop (or within any given block passed to [editContext performBlock:]). It is also best to be disciplined and avoid ever doing any edits on the main context.
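A rough sketch of that fetch, modify, save and reset cycle inside a single performBlock:. The function name EditCarType is hypothetical, and the "type" attribute is borrowed from the Car model in the question; neither comes from the answer itself.

```objc
#import <CoreData/CoreData.h>

// Applies one edit on the edit context and leaves the context empty afterwards.
// `carID` should be an ID the edit context can resolve through its parent.
static void EditCarType(NSManagedObjectContext *editContext,
                        NSManagedObjectID *carID,
                        NSString *newType) {
    [editContext performBlock:^{
        NSError *error = nil;
        // Fetch: faults the object in through the parent context(s).
        NSManagedObject *car = [editContext existingObjectWithID:carID error:&error];
        if (car == nil) {
            NSLog(@"Could not load object: %@", error);
            return;
        }
        // Modify.
        [car setValue:newType forKey:@"type"];
        // Save: pushes the change one level up, to the parent context.
        if (![editContext save:&error]) {
            NSLog(@"Could not save edit context: %@", error);
        }
        // Reset: drop all registered objects so no stale copies linger.
        [editContext reset];
    }];
}
```

Because the context is reset at the end of the block, the next edit always starts from the parent's current state of the object.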
Also, to reiterate: since all processing on the main context happens on the main thread, if you fetch lots of objects into the edit context, the main context will be doing its fetch processing on the main thread as those objects are copied down from parent to child contexts. If a lot of data is being processed, this can make the UI unresponsive. So if, for example, you have a large store of managed objects and a UI option that would result in all of them being edited, it would be a bad idea to configure your app like option A. In such a case option B is a better bet.
If you aren't processing thousands of objects, then option A may be entirely sufficient.
BTW, don't worry too much about which option you select. It might be a good idea to start with A and change to B if you need to. It's easier than you might think to make such a change, and it usually has fewer consequences than you might expect.
Firstly, parent/child contexts are not for background processing. They are for atomic updates of related data that might be created in multiple view controllers, so that if the last view controller is cancelled, the child context can be thrown away with no adverse effects on the parent. This is fully explained by Apple at the bottom of this answer at [^1]. Now that that is out of the way and you haven't fallen for the common mistake, you can focus on how to do background Core Data properly.
Create a new persistent store coordinator (no longer needed on iOS 10, see the update below) and a private queue context. Listen for the save notification and merge the changes into the main context (on iOS 10 the context has a property to do this automatically).
For a sample by Apple see "Earthquakes: Populating a Core Data Store Using a Background Queue"
https://developer.apple.com/library/mac/samplecode/Earthquakes/Introduction/Intro.html
As you can see from the revision history on 2014-08-19 they added
"New sample code that shows how to use a second Core Data stack to fetch data on a background queue."
Here is that bit from AAPLCoreDataStackManager.m:
// Creates a new Core Data stack and returns a managed object context associated with a private queue.
- (NSManagedObjectContext *)createPrivateQueueContext:(NSError * __autoreleasing *)error {
// It uses the same store and model, but a new persistent store coordinator and context.
NSPersistentStoreCoordinator *localCoordinator = [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:[AAPLCoreDataStackManager sharedManager].managedObjectModel];
if (![localCoordinator addPersistentStoreWithType:NSSQLiteStoreType configuration:nil
URL:[AAPLCoreDataStackManager sharedManager].storeURL
options:nil
error:error]) {
return nil;
}
NSManagedObjectContext *context = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
[context performBlockAndWait:^{
[context setPersistentStoreCoordinator:localCoordinator];
// Avoid using default merge policy in multi-threading environment:
// when we delete (and save) a record in one context,
// and try to save edits on the same record in the other context before merging the changes,
// an exception will be thrown because Core Data by default uses NSErrorMergePolicy.
// Setting a reasonable mergePolicy is a good practice to avoid that kind of exception.
context.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy;
// In OS X, a context provides an undo manager by default
// Disable it for performance benefit
context.undoManager = nil;
}];
return context;
}
And in AAPLQuakesViewController.m
- (void)contextDidSaveNotificationHandler:(NSNotification *)notification {
if (notification.object != self.managedObjectContext) {
[self.managedObjectContext performBlock:^{
[self.managedObjectContext mergeChangesFromContextDidSaveNotification:notification];
}];
}
}
Here is the full description of how the sample is designed:
Earthquakes: Using a "private" persistent store coordinator to fetch data in background
Most applications that use Core Data employ a single persistent store coordinator to mediate access to a given persistent store. Earthquakes shows how to use an additional "private" persistent store coordinator when creating managed objects using data retrieved from a remote server.
Application Architecture
The application uses two Core Data "stacks" (as defined by the existence of a persistent store coordinator). The first is the typical "general purpose" stack; the second is created by a view controller specifically to fetch data from a remote server (As of iOS 10 a second coordinator is no longer needed, see update at bottom of answer).
The main persistent store coordinator is vended by a singleton "stack controller" object (an instance of CoreDataStackManager). It is the responsibility of its clients to create a managed object context to work with the coordinator[^1]. The stack controller also vends properties for the managed object model used by the application, and the location of the persistent store. Clients can use these latter properties to set up additional persistent store coordinators to work in parallel with the main coordinator.
The main view controller, an instance of QuakesViewController, uses the stack controller's persistent store coordinator to fetch quakes from the persistent store to display in a table view. Retrieving data from the server can be a long-running operation which requires significant interaction with the persistent store to determine whether records retrieved from the server are new quakes or potential updates to existing quakes. To ensure that the application can remain responsive during this operation, the view controller employs a second coordinator to manage interaction with the persistent store. It configures the coordinator to use the same managed object model and persistent store as the main coordinator vended by the stack controller. It creates a managed object context bound to a private queue to fetch data from the store and commit changes to the store.
[^1]: This supports the "pass the baton" approach whereby—particularly in iOS applications—a context is passed from one view controller to another. The root view controller is responsible for creating the initial context, and passing it to child view controllers as/when necessary.
The reason for this pattern is to ensure that changes to the managed object graph are appropriately constrained. Core Data supports "nested" managed object contexts which allow for a flexible architecture that make it easy to support independent, cancellable, change sets. With a child context, you can allow the user to make a set of changes to managed objects that can then either be committed wholesale to the parent (and ultimately saved to the store) as a single transaction, or discarded. If all parts of the application simply retrieve the same context from, say, an application delegate, it makes this behavior difficult or impossible to support.
Update: In iOS 10 Apple moved synchronisation from the sqlite file level up to the persistent coordinator. This means you can now create a private queue context and reuse the existing coordinator used by the main context without the same performance problems you would have had doing it that way before, cool!
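For the iOS 10 case, a minimal sketch might look like the following. It assumes that the property mentioned earlier is automaticallyMergesChangesFromParent (available from iOS 10), that the main context is attached directly to the coordinator, and that the function is called on the main thread. MakeImportContext is a hypothetical name.

```objc
#import <CoreData/CoreData.h>

// Returns a private-queue context that shares the main context's coordinator.
// Requires iOS 10 / macOS 10.12 or later.
static NSManagedObjectContext *MakeImportContext(NSManagedObjectContext *mainContext) {
    // Let the main context pick up saves from sibling contexts without
    // observing NSManagedObjectContextDidSaveNotification by hand.
    mainContext.automaticallyMergesChangesFromParent = YES;

    NSManagedObjectContext *importContext =
        [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    importContext.persistentStoreCoordinator = mainContext.persistentStoreCoordinator;
    // Same settings as in Apple's sample above.
    importContext.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy;
    importContext.undoManager = nil;
    return importContext;
}
```

All work on the returned context still has to go through performBlock: or performBlockAndWait:.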
By the way, this Apple document explains the problem very clearly. Here is a Swift version of the above for anyone interested:
let jsonArray = … //JSON data to be imported into Core Data
let moc = … //Our primary context on the main queue
let privateMOC = NSManagedObjectContext(concurrencyType: .PrivateQueueConcurrencyType)
privateMOC.parentContext = moc
privateMOC.performBlock {
for jsonObject in jsonArray {
let mo = … //Managed object that matches the incoming JSON structure
//update MO with data from the dictionary
}
do {
try privateMOC.save()
moc.performBlockAndWait {
do {
try moc.save()
} catch {
fatalError("Failure to save context: \(error)")
}
}
} catch {
fatalError("Failure to save context: \(error)")
}
}
And even simpler if you are using NSPersistentContainer for iOS 10 and above
let jsonArray = …
let container = self.persistentContainer
container.performBackgroundTask() { (context) in
for jsonObject in jsonArray {
let mo = CarMO(context: context)
mo.populateFromJSON(jsonObject)
}
do {
try context.save()
} catch {
fatalError("Failure to save context: \(error)")
}
}

best way to concurrently fetch data from a server and modify core data graph as the data arrives? [duplicate]

Question: How do I get my child context to see changes persisted on the parent context so that they trigger my NSFetchedResultsController to update the UI?
Here's the setup:
You've got an app that downloads and adds lots of XML data (about 2 million records, each roughly the size of a normal paragraph of text). The .sqlite file becomes about 500 MB in size. Adding this content into Core Data takes time, but you want the user to be able to use the app while the data loads into the data store incrementally. It's got to be invisible and imperceptible to the user that large amounts of data are being moved around, so no hangs, no jitters: scrolls like butter. Still, the app is more useful the more data is added to it, so we can't wait forever for the data to be added to the Core Data store. In code this means I'd really like to avoid code like this in the import code:
[[NSRunLoop currentRunLoop] runUntilDate:[NSDate dateWithTimeIntervalSinceNow:0.25]];
The app is iOS 5 only so the slowest device it needs to support is an iPhone 3GS.
Here are the resources I've used so far to develop my current solution:
Apple's Core Data Programming Guide: Efficiently Importing Data
Use Autorelease Pools to keep the memory down
Relationships Cost. Import flat, then patch up relationships at the end
Don't query if you can help it, it slows things down in an O(n^2) manner
Import in Batches: save, reset, drain and repeat
Turn off the Undo Manager on import
iDeveloper TV - Core Data Performance
Use 3 Contexts: Master, Main and Confinement context types
iDeveloper TV - Core Data for Mac, iPhone & iPad Update
Running saves on other queues with performBlock makes things fast.
Encryption slows things down, turn it off if you can.
Importing and Displaying Large Data Sets in Core Data by Marcus Zarra
You can slow down the import by giving time to the current run loop,
so things feel smooth to the user.
Sample Code proves that it is possible to do large imports and keep the UI responsive, but not as fast as with 3 contexts and async saving to disk.
My Current Solution
I've got 3 instances of NSManagedObjectContext:
masterManagedObjectContext - This is the context that has the NSPersistentStoreCoordinator and is responsible for saving to disk. I do this so my saves can be asynchronous and therefore very fast. I create it on launch like this:
masterManagedObjectContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
[masterManagedObjectContext setPersistentStoreCoordinator:coordinator];
mainManagedObjectContext - This is the context the UI uses everywhere. It is a child of the masterManagedObjectContext. I create it like this:
mainManagedObjectContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
[mainManagedObjectContext setUndoManager:nil];
[mainManagedObjectContext setParentContext:masterManagedObjectContext];
backgroundContext - This context is created in my NSOperation subclass that is responsible for importing the XML data into Core Data. I create it in the operation's main method and link it to the master context there.
backgroundContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSConfinementConcurrencyType];
[backgroundContext setUndoManager:nil];
[backgroundContext setParentContext:masterManagedObjectContext];
This actually works very, VERY fast. Just by doing this 3 context setup I was able to improve my import speed by over 10x! Honestly, this is hard to believe. (This basic design should be part of the standard Core Data template...)
During the import process I save 2 different ways. Every 1000 items I save on the background context:
BOOL saveSuccess = [backgroundContext save:&error];
Then at the end of the import process, I save on the master/parent context which, ostensibly, pushes modifications out to the other child contexts including the main context:
[masterManagedObjectContext performBlock:^{
NSError *parentContextError = nil;
BOOL parentContextSaveSuccess = [masterManagedObjectContext save:&parentContextError];
}];
Problem: The problem is that my UI will not update until I reload the view.
I have a simple UIViewController with a UITableView that is being fed data using an NSFetchedResultsController. When the import process completes, the NSFetchedResultsController sees no changes from the parent/master context, and so the UI doesn't automatically update like I'm used to seeing. If I pop the UIViewController off the stack and load it again, all the data is there.
Question: How do I get my child context to see changes persisted on the parent context so that they trigger my NSFetchedResultsController to update the UI?
I have tried the following which just hangs the app:
- (void)saveMasterContext {
NSNotificationCenter *notificationCenter = [NSNotificationCenter defaultCenter];
[notificationCenter addObserver:self selector:@selector(contextChanged:) name:NSManagedObjectContextDidSaveNotification object:masterManagedObjectContext];
NSError *error = nil;
BOOL saveSuccess = [masterManagedObjectContext save:&error];
[notificationCenter removeObserver:self name:NSManagedObjectContextDidSaveNotification object:masterManagedObjectContext];
}
- (void)contextChanged:(NSNotification*)notification
{
if ([notification object] == mainManagedObjectContext) return;
if (![NSThread isMainThread]) {
[self performSelectorOnMainThread:@selector(contextChanged:) withObject:notification waitUntilDone:YES];
return;
}
[mainManagedObjectContext mergeChangesFromContextDidSaveNotification:notification];
}
You should probably save the master MOC in strides as well. No sense having that MOC wait until the end to save. It has its own thread, and it will help keep memory down as well.
You wrote:
Then at the end of the import process, I save on the master/parent
context which, ostensibly, pushes modifications out to the other child
contexts including the main context:
In your configuration, you have two children (the main MOC and the background MOC), both parented to the "master."
When you save on a child, it pushes the changes up into the parent. Other children of that MOC will see the data the next time they perform a fetch... they are not explicitly notified.
So, when BG saves, its data is pushed to MASTER. Note, however, that none of this data is on disk until MASTER saves. Furthermore, any new items will not get permanent IDs until the MASTER saves to disk.
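If temporary IDs on newly inserted objects are a concern, one option (not part of the original answer) is to request permanent IDs explicitly before saving the child context. A small sketch, with SaveWithPermanentIDs as a hypothetical helper name:

```objc
#import <CoreData/CoreData.h>

// Converts the temporary IDs of newly inserted objects to permanent ones, then saves.
// Must be called on the context's own queue, e.g. inside performBlock:.
static BOOL SaveWithPermanentIDs(NSManagedObjectContext *context, NSError **error) {
    NSArray<NSManagedObject *> *inserted = context.insertedObjects.allObjects;
    if (inserted.count > 0 &&
        ![context obtainPermanentIDsForObjects:inserted error:error]) {
        return NO;
    }
    return [context save:error];
}
```

The extra call costs a round trip to the persistent store, so it is probably best done once per batch rather than per object.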
In your scenario, you are pulling the data into the MAIN MOC by merging from the MASTER save during the DidSave notification.
That should work, so I'm curious as to where it is "hung." I will note that you are not running on the main MOC thread in the canonical way (at least not for iOS 5).
Also, you are probably only interested in merging changes from the master MOC (though your registration looks like it is only for that anyway). If I were to use the update-on-did-save-notification approach, I'd do this...
- (void)contextChanged:(NSNotification*)notification {
// Only interested in merging from master into main.
if ([notification object] != masterManagedObjectContext) return;
[mainManagedObjectContext performBlock:^{
[mainManagedObjectContext mergeChangesFromContextDidSaveNotification:notification];
// NOTE: our MOC should not be updated, but we need to reload the data as well
}];
}
Now, for what may be your real issue regarding the hang: you show two different calls to save on the master. The first is well protected in its own performBlock, but the second is not (though you may be calling saveMasterContext in a performBlock).
However, I'd also change this code...
- (void)saveMasterContext {
NSNotificationCenter *notificationCenter = [NSNotificationCenter defaultCenter];
[notificationCenter addObserver:self selector:@selector(contextChanged:) name:NSManagedObjectContextDidSaveNotification object:masterManagedObjectContext];
// Make sure the master runs in its own thread...
[masterManagedObjectContext performBlock:^{
NSError *error = nil;
BOOL saveSuccess = [masterManagedObjectContext save:&error];
// Handle error...
[notificationCenter removeObserver:self name:NSManagedObjectContextDidSaveNotification object:masterManagedObjectContext];
}];
}
However, note that the MAIN is a child of MASTER. So, it should not have to merge the changes. Instead, just watch for the DidSave on the master, and just refetch! The data is sitting in your parent already, just waiting for you to ask for it. That's one of the benefits of having the data in the parent in the first place.
Another alternative to consider (and I'd be interested to hear about your results -- that's a lot of data)...
Instead of making the background MOC a child of the MASTER, make it a child of the MAIN.
Get this. Every time the BG saves, it automatically gets pushed into the MAIN. Now, the MAIN has to call save, and then the master has to call save, but all those are doing is moving pointers... until the master saves to disk.
The beauty of that method is that the data goes from the background MOC straight into your application's MOC (then passes through to get saved).
There is some penalty for the pass-through, but all the heavy lifting gets done in the MASTER when it hits the disk. And if you kick off those saves on the master with performBlock, then the main thread just sends off the request and returns immediately.
Please let me know how it goes!

Concurrency with core data

I am using multi-threading to get data, parse it, create objects and store them. And after this is all done, I want the window to be shown.
But now I have 2 issues:
I have a deadlock
My barrier does not act as a barrier.
I think the deadlock is because I am updating the managedObjectContext in several threads at once.
So I changed my managedObjectContext with the ConcurrencyType:
__managedObjectContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
And created an importContext for the concurrency queue and assigned the parentContext:
NSManagedObjectContext *importContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
importContext.parentContext = self.managedObjectContext;
And put my operations in a performBlock for the importContext:
[importContext performBlock:^{
dispatch_async(backgroundQueue, ^{
[myObject methodAWithContext:importContext];
});
dispatch_async(backgroundQueue, ^{
[myObject methodBWithContext:importContext];
});
dispatch_async(backgroundQueue, ^{
[myObject methodCWithContext:importContext];
});
dispatch_barrier_async(backgroundQueueM, ^{
// create barrier to wait for the first 3 threads to be completed.
dispatch_async(dispatch_get_main_queue(), ^{
// Save the data from the importContext to the main context on the main queue
NSError *importError = nil;
[importContext save:&importError];
[importContext.parentContext performBlock:^{
NSError *parentError = nil;
[importContext.parentContext save:&parentError];
}];
[self.window makeKeyAndVisible];
});
});
}];
Approach 1:
In each method, I select a subset of objects, delete them, then create new objects and save.
(I thought the delete was quicker than doing a fetch and check for the existence for every object to be created).
So:
In Method A I select all AObjects, delete them and create new AObjects.
In Method B I select all BObjects, delete them and create new BObjects.
In Method C I select all CObjects, delete them and create new CObjects.
But then I get an error "An NSManagedObjectContext cannot delete objects in other contexts".
So approach 2:
I removed the delete. But now I get various different errors.....
And the barrier does not wait for the other threads to be executed.
Q1: What am I doing wrong?
Q2: How do I get the barrier to wait for the 3 threads to be completed?
Q3: How can I delete / purge objects on various threads?
(I have read the Apple release notes and docs, but I can't find a clear explanation of how multithreading and managed object contexts combine.)
You cannot use dispatch_async within performBlock to move work on that context to another queue. A managed object context of type NSPrivateQueueConcurrencyType has its own dispatch queue for executing its operations.
You are trying to run several operations in parallel by moving them to a different dispatch queue, but that is not possible with a single context.
If you really have to do multiple operations in parallel, you must create a private concurrency type MOC for each operation.
ADDED:
There are several ways to wait for all operations to complete:
You could increment a counter at the end of each performBlock: and check if its value is (in your example) 3.
You could create a semaphore (dispatch_semaphore_create) for each operation with initial value zero, wait for all the semaphores (dispatch_semaphore_wait) and signal the semaphore at the end of each performBlock.
And I am sure that there are better/more elegant/more sophisticated ways to do this.
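One such alternative is a GCD dispatch group, which is a different technique from the counter and semaphores listed above. The following is only a sketch under assumptions: it is compiled with ARC, `RunImports` is a hypothetical helper name, and each element of `importBlocks` is a block that takes the context it should work in. Each operation gets its own private-queue context, as suggested above, and the completion block runs on the main queue once every import has finished.

```objective-c
#import <CoreData/CoreData.h>

typedef void (^ImportBlock)(NSManagedObjectContext *context);

// Hypothetical helper: runs every import block on its own private-queue
// context (child of mainContext) and calls `completion` on the main queue
// after all of them have finished and saved.
static void RunImports(NSManagedObjectContext *mainContext,
                       NSArray *importBlocks,
                       dispatch_block_t completion)
{
    dispatch_group_t group = dispatch_group_create();

    for (ImportBlock importBlock in importBlocks) {
        NSManagedObjectContext *context = [[NSManagedObjectContext alloc]
            initWithConcurrencyType:NSPrivateQueueConcurrencyType];
        context.parentContext = mainContext;

        dispatch_group_enter(group);
        [context performBlock:^{
            // All work on `context` happens on its own queue.
            importBlock(context);

            NSError *error = nil;
            if (![context save:&error]) {
                NSLog(@"Import save failed: %@", error);
            }
            dispatch_group_leave(group);
        }];
    }

    // Runs asynchronously once every enter has been balanced by a leave.
    dispatch_group_notify(group, dispatch_get_main_queue(), completion);
}
```

The completion block is a natural place to save the main context and refresh the already visible UI. The main thread must not block while waiting for the group, because saving a child context needs its main-queue parent to be responsive.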
BUT: As I re-read your question, I see that you try to delay the
[self.window makeKeyAndVisible];
until all Core Data fetch operations have completed. This is not a good design, because the user will see nothing until your data import is done.
A better design is to show an initial view immediately, and update that view when the background operations have fetched data.

Implementing Fast and Efficient Core Data Import on iOS 5

Question: How do I get my child context to see changes persisted on the parent context so that they trigger my NSFetchedResultsController to update the UI?
Here's the setup:
You've got an app that downloads and adds lots of XML data (about 2 million records, each roughly the size of a normal paragraph of text). The .sqlite file becomes about 500 MB in size. Adding this content into Core Data takes time, but you want the user to be able to use the app while the data loads into the data store incrementally. It's got to be invisible and imperceptible to the user that large amounts of data are being moved around, so no hangs, no jitters: scrolls like butter. Still, the app is more useful the more data is added to it, so we can't wait forever for the data to be added to the Core Data store. In code this means I'd really like to avoid code like this in the import code:
[[NSRunLoop currentRunLoop] runUntilDate:[NSDate dateWithTimeIntervalSinceNow:0.25]];
The app is iOS 5 only so the slowest device it needs to support is an iPhone 3GS.
Here are the resources I've used so far to develop my current solution:
Apple's Core Data Programming Guide: Efficiently Importing Data
Use Autorelease Pools to keep the memory down
Relationships Cost. Import flat, then patch up relationships at the end
Don't query if you can help it, it slows things down in an O(n^2) manner
Import in Batches: save, reset, drain and repeat
Turn off the Undo Manager on import
iDeveloper TV - Core Data Performance
Use 3 Contexts: Master, Main and Confinement context types
iDeveloper TV - Core Data for Mac, iPhone & iPad Update
Running saves on other queues with performBlock makes things fast.
Encryption slows things down, turn it off if you can.
Importing and Displaying Large Data Sets in Core Data by Marcus Zarra
You can slow down the import by giving time to the current run loop,
so things feel smooth to the user.
Sample Code proves that it is possible to do large imports and keep the UI responsive, but not as fast as with 3 contexts and async saving to disk.
My Current Solution
I've got 3 instances of NSManagedObjectContext:
masterManagedObjectContext - This is the context that has the NSPersistentStoreCoordinator and is responsible for saving to disk. I do this so my saves can be asynchronous and therefore very fast. I create it on launch like this:
masterManagedObjectContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
[masterManagedObjectContext setPersistentStoreCoordinator:coordinator];
mainManagedObjectContext - This is the context the UI uses everywhere. It is a child of the masterManagedObjectContext. I create it like this:
mainManagedObjectContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
[mainManagedObjectContext setUndoManager:nil];
[mainManagedObjectContext setParentContext:masterManagedObjectContext];
backgroundContext - This context is created in my NSOperation subclass that is responsible for importing the XML data into Core Data. I create it in the operation's main method and link it to the master context there.
backgroundContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSConfinementConcurrencyType];
[backgroundContext setUndoManager:nil];
[backgroundContext setParentContext:masterManagedObjectContext];
This actually works very, VERY fast. Just by doing this 3 context setup I was able to improve my import speed by over 10x! Honestly, this is hard to believe. (This basic design should be part of the standard Core Data template...)
During the import process I save 2 different ways. Every 1000 items I save on the background context:
BOOL saveSuccess = [backgroundContext save:&error];
Then at the end of the import process, I save on the master/parent context which, ostensibly, pushes modifications out to the other child contexts including the main context:
[masterManagedObjectContext performBlock:^{
NSError *parentContextError = nil;
BOOL parentContextSaveSuccess = [masterManagedObjectContext save:&parentContextError];
}];
Problem: The problem is that my UI will not update until I reload the view.
I have a simple UIViewController with a UITableView that is being fed data using an NSFetchedResultsController. When the import process completes, the NSFetchedResultsController sees no changes from the parent/master context, and so the UI doesn't automatically update like I'm used to seeing. If I pop the UIViewController off the stack and load it again, all the data is there.
Question: How do I get my child context to see changes persisted on the parent context so that they trigger my NSFetchedResultsController to update the UI?
I have tried the following which just hangs the app:
- (void)saveMasterContext {
NSNotificationCenter *notificationCenter = [NSNotificationCenter defaultCenter];
[notificationCenter addObserver:self selector:@selector(contextChanged:) name:NSManagedObjectContextDidSaveNotification object:masterManagedObjectContext];
NSError *error = nil;
BOOL saveSuccess = [masterManagedObjectContext save:&error];
[notificationCenter removeObserver:self name:NSManagedObjectContextDidSaveNotification object:masterManagedObjectContext];
}
- (void)contextChanged:(NSNotification*)notification
{
if ([notification object] == mainManagedObjectContext) return;
if (![NSThread isMainThread]) {
[self performSelectorOnMainThread:@selector(contextChanged:) withObject:notification waitUntilDone:YES];
return;
}
[mainManagedObjectContext mergeChangesFromContextDidSaveNotification:notification];
}
You should probably save the master MOC in strides as well. No sense having that MOC wait until the end to save. It has its own thread, and it will help keep memory down as well.
You wrote:
Then at the end of the import process, I save on the master/parent
context which, ostensibly, pushes modifications out to the other child
contexts including the main context:
In your configuration, you have two children (the main MOC and the background MOC), both parented to the "master."
When you save on a child, it pushes the changes up into the parent. Other children of that MOC will see the data the next time they perform a fetch... they are not explicitly notified.
So, when BG saves, its data is pushed to MASTER. Note, however, that none of this data is on disk until MASTER saves. Furthermore, any new items will not get permanent IDs until the MASTER saves to disk.
In your scenario, you are pulling the data into the MAIN MOC by merging from the MASTER save during the DidSave notification.
That should work, so I'm curious as to where it is "hung." I will note that you are not getting onto the main MOC's thread in the canonical way (at least not for iOS 5).
Also, you probably only are interested in merging changes from the master MOC (though your registration looks like it is only for that anyway). If I were to use the update-on-did-save-notification, I'd do this...
- (void)contextChanged:(NSNotification*)notification {
// Only interested in merging from master into main.
if ([notification object] != masterManagedObjectContext) return;
[mainManagedObjectContext performBlock:^{
[mainManagedObjectContext mergeChangesFromContextDidSaveNotification:notification];
// NOTE: our MOC should now be updated, but we need to reload the data as well
}];
}
Now, for what may be your real issue regarding the hang... you show two different calls to save on the master. The first is well protected in its own performBlock, but the second is not (though you may be calling saveMasterContext from inside a performBlock).
However, I'd also change this code...
- (void)saveMasterContext {
NSNotificationCenter *notificationCenter = [NSNotificationCenter defaultCenter];
[notificationCenter addObserver:self selector:@selector(contextChanged:) name:NSManagedObjectContextDidSaveNotification object:masterManagedObjectContext];
// Make sure the master runs in its own thread...
[masterManagedObjectContext performBlock:^{
NSError *error = nil;
BOOL saveSuccess = [masterManagedObjectContext save:&error];
// Handle error...
[notificationCenter removeObserver:self name:NSManagedObjectContextDidSaveNotification object:masterManagedObjectContext];
}];
}
However, note that the MAIN is a child of MASTER. So, it should not have to merge the changes. Instead, just watch for the DidSave on the master, and just refetch! The data is sitting in your parent already, just waiting for you to ask for it. That's one of the benefits of having the data in the parent in the first place.
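A rough sketch of the "watch for the DidSave and refetch" idea follows. It assumes ARC, and `RecordsViewController`, `masterContext`, and `masterContextDidSave:` are hypothetical names introduced only for illustration. It also assumes the fetched results controller was created against the main MOC.

```objective-c
#import <UIKit/UIKit.h>
#import <CoreData/CoreData.h>

// Hypothetical table view controller; all names are illustrative.
@interface RecordsViewController : UITableViewController
@property (nonatomic, strong) NSManagedObjectContext *masterContext;
@property (nonatomic, strong) NSFetchedResultsController *fetchedResultsController;
@end

@implementation RecordsViewController

- (void)viewDidLoad {
    [super viewDidLoad];
    // Assumes masterContext was set before the view loads.
    [[NSNotificationCenter defaultCenter]
        addObserver:self
           selector:@selector(masterContextDidSave:)
               name:NSManagedObjectContextDidSaveNotification
             object:self.masterContext];
}

- (void)masterContextDidSave:(NSNotification *)notification {
    // The notification arrives on the thread that performed the save,
    // so hop to the main queue before touching the UI or the main MOC.
    dispatch_async(dispatch_get_main_queue(), ^{
        NSError *error = nil;
        if ([self.fetchedResultsController performFetch:&error]) {
            [self.tableView reloadData];
        } else {
            NSLog(@"Refetch failed: %@", error);
        }
    });
}

- (void)dealloc {
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

@end
```

A refetch picks up newly inserted rows from the parent. Objects that are already registered in the main MOC may still need refreshing to show changed attribute values.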
Another alternative to consider (and I'd be interested to hear about your results -- that's a lot of data)...
Instead of making the background MOC a child of the MASTER, make it a child of the MAIN.
Get this. Every time the BG saves, it automatically gets pushed into the MAIN. Now, the MAIN has to call save, and then the master has to call save, but all those are doing is moving pointers... until the master saves to disk.
The beauty of that method is that the data goes from the background MOC straight into your application's MOC (then passes through to get saved).
There is some penalty for the pass-through, but all the heavy lifting gets done in the MASTER when it hits the disk. And if you kick off those saves on the master with performBlock, then the main thread just sends off the request and returns immediately.
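To illustrate that alternative, here is a minimal sketch of the save chain, assuming the background MOC's parent is the main MOC and the main MOC's parent is the master. `SaveThroughChain` is a hypothetical helper name, and the function is assumed to be called from the background context's own thread or queue.

```objective-c
#import <CoreData/CoreData.h>

// Hypothetical helper: pushes changes background -> main -> master -> disk.
// Call it from the thread or queue that owns backgroundContext.
static void SaveThroughChain(NSManagedObjectContext *backgroundContext)
{
    NSError *error = nil;
    if (![backgroundContext save:&error]) {
        NSLog(@"Background save failed: %@", error);
        return;
    }

    NSManagedObjectContext *mainContext = backgroundContext.parentContext;
    [mainContext performBlock:^{
        // In-memory push from main into master; no disk I/O yet.
        NSError *mainError = nil;
        if (![mainContext save:&mainError]) {
            NSLog(@"Main save failed: %@", mainError);
            return;
        }

        NSManagedObjectContext *masterContext = mainContext.parentContext;
        [masterContext performBlock:^{
            // The actual write to the persistent store happens here,
            // on the master's private queue.
            NSError *masterError = nil;
            if (![masterContext save:&masterError]) {
                NSLog(@"Master save failed: %@", masterError);
            }
        }];
    }];
}
```

Calling this once per import batch would also save the master in strides rather than only at the end of the import.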
Please let me know how it goes!

Multithreaded Core Data - NSManagedObject invalidated

As the title suggests, I'm working with a Core Data application which gets filled with objects in different background threads (XML parsing).
In my background thread I'm doing this
managedContext = [[NSManagedObjectContext alloc] init];
[managedContext setUndoManager:nil];
[managedContext setPersistentStoreCoordinator: [[DataManager sharedManager] persistentStoreCoordinator]];
NSNotificationCenter *nc = [NSNotificationCenter defaultCenter];
[nc addObserver:self
selector:@selector(mergeChanges:)
name:NSManagedObjectContextDidSaveNotification
object:managedContext];
NSMutableArray *returnSource = [[self parseDocument:doc] retain];
[managedContext save:&error];
if (error) {
NSLog(@"saving error in datafeed");
}
[managedContext reset];
[self performSelectorOnMainThread:@selector(parseCompleteWithSource:) withObject:returnSource waitUntilDone:YES];
The Merge method looks like this:
NSManagedObjectContext *mainContext = [[DataManager sharedManager] managedObjectContext];
// Merge changes into the main context on the main thread
[mainContext performSelectorOnMainThread:@selector(mergeChangesFromContextDidSaveNotification:)
withObject:notification
waitUntilDone:YES];
[[NSNotificationCenter defaultCenter] removeObserver:self];
I think the merge is successful, but when I want to display the objects in a UITableView it always tells me that my objects are invalidated, which is to be expected because of
[managedContext reset];
What I want to do is show the items which are currently in the database, parse the XML in the background, and when that's finished update the UITableView with the new / updated objects. How would I do that? Can I "update" the objects to the other context somehow, or is the merge not working correctly?
Do I need to define something specific in the Main ObjectContext?
I've tried different mergepolicies without any luck.
Hope you can help me, thanks!
I believe your problem is the contents of the returnSource array. If that is a bunch of NSManagedObject instances then they will have been created on the background thread by the background thread context.
Your call to -[NSManagedObjectContext reset] will invalidate them, since that is what you explicitly tell the context to do. But that is not the big problem.
You then go on to send the array to the main thread. Passing NSManagedObject instances over thread borders, and between contexts, is a big no-no.
What you need to do is:
Create an array with the NSManagedObjectIDs of the NSManagedObjects.
Send the object ID array over the thread boundary.
Recreate an array of NSManagedObjects from the managed object IDs on the new thread, using its context.
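The three steps above could look roughly like this. It is a sketch only: `ObjectIDsForObjects` and `ObjectsForIDs` are hypothetical helper names, and the IDs are assumed to be collected after the background context has saved, so that they are permanent.

```objective-c
#import <CoreData/CoreData.h>

// Step 1 (background thread, after saving): collect the object IDs.
static NSArray *ObjectIDsForObjects(NSArray *managedObjects)
{
    NSMutableArray *objectIDs =
        [NSMutableArray arrayWithCapacity:[managedObjects count]];
    for (NSManagedObject *object in managedObjects) {
        [objectIDs addObject:[object objectID]];
    }
    return objectIDs;
}

// Step 3 (destination thread): look the objects up again in the context
// that belongs to that thread. IDs that cannot be resolved are skipped.
static NSArray *ObjectsForIDs(NSArray *objectIDs,
                              NSManagedObjectContext *context)
{
    NSMutableArray *objects =
        [NSMutableArray arrayWithCapacity:[objectIDs count]];
    for (NSManagedObjectID *objectID in objectIDs) {
        NSError *error = nil;
        NSManagedObject *object =
            [context existingObjectWithID:objectID error:&error];
        if (object) {
            [objects addObject:object];
        } else {
            NSLog(@"Could not load object %@: %@", objectID, error);
        }
    }
    return objects;
}
```

Step 2 is then just a matter of handing the ID array to the main thread, for example as the `withObject:` argument of `performSelectorOnMainThread:withObject:waitUntilDone:`, and calling `ObjectsForIDs` there with the main context.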
I have made some Core Data helpers, following the rule of three (the third time you write something, make it general).
Most importantly, I have hidden the complexity of managing different managed object contexts for each thread, handling notifications, and all that junk. Instead I have introduced the concept of thread-local contexts: lazily created NSManagedObjectContext instances that automatically register for updates and clean up when the current thread exits.
A normal use-case:
NSManagedObjectContext* context = [NSManagedObjectContext threadLocalContext];
// Do your stuff!!
NSError* error = nil;
if (![context saveWithFailureOption:NSManagedObjectContextCWSaveFailureOptionThreadDefault
error:&error])
{
// Handle error.
}
The full source code, including a sample app for parsing the news RSS from apple.com and store it in Core Data, is available here: https://github.com/jayway/CWCoreData.
There is no reason in this case to call reset on the background context because it will disappear anyway with the background thread and you never use it again in any case. You usually use reset with undo management when you want the context to forget previous steps.
Your major problem here is that your background thread is configured to receive notifications from the managed object context it creates. That is rather pointless.
Instead, you need to register the table view controller (on the main thread) to receive the NSManagedObjectContextDidSaveNotification from the context on the background thread. That way, when the background context saves, the main context can merge in the new data, which will trigger an update of the table view.
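A sketch of that registration might look like the following. It assumes ARC (the question's code uses manual retain/release, so adapt accordingly), and `FeedViewController`, `mainContext`, and `backgroundContextDidSave:` are hypothetical names. Because the background context is created elsewhere, the sketch observes saves from any context and filters inside the handler.

```objective-c
#import <UIKit/UIKit.h>
#import <CoreData/CoreData.h>

// Hypothetical table view controller that owns a reference to the main context.
@interface FeedViewController : UITableViewController
@property (nonatomic, strong) NSManagedObjectContext *mainContext;
@end

@implementation FeedViewController

- (void)viewDidLoad {
    [super viewDidLoad];
    // object:nil means "saves from any context"; filtered in the handler.
    [[NSNotificationCenter defaultCenter]
        addObserver:self
           selector:@selector(backgroundContextDidSave:)
               name:NSManagedObjectContextDidSaveNotification
             object:nil];
}

- (void)backgroundContextDidSave:(NSNotification *)notification {
    NSManagedObjectContext *savedContext = [notification object];

    // Ignore our own saves and saves that belong to an unrelated store.
    if (savedContext == self.mainContext) return;
    if (savedContext.persistentStoreCoordinator !=
        self.mainContext.persistentStoreCoordinator) return;

    // The notification is delivered on the saving thread; merge on main.
    dispatch_async(dispatch_get_main_queue(), ^{
        [self.mainContext mergeChangesFromContextDidSaveNotification:notification];
    });
}

- (void)dealloc {
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

@end
```

If the table is driven by an NSFetchedResultsController on the main context, the merge should reach its delegate callbacks. Otherwise the table needs an explicit reload after the merge.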

Resources