Merging main and private contexts with Core Data

I'm creating temporary contexts on a private queue to asynchronously update the data I persist with Core Data:
NSManagedObjectContext *privateContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
AppDelegate *appDelegate = [[UIApplication sharedApplication] delegate];
privateContext.persistentStoreCoordinator = appDelegate.persistentStoreCoordinator;
[privateContext performBlock: ^{
// Parse files and/or call services and parse
// their responses
dispatch_async(dispatch_get_main_queue(), ^{
// Notify update to user
});
}];
Then, once I've got the new data, I need to merge the changes into my main context. I know this is a common scenario, but I'm not sure how I should proceed. A section of Apple's Core Data documentation talks about setting a merge policy, and I don't fully understand how to handle that. On the other hand, I found this link, where my scenario is described in its "Stack #2" section; what it says looks simpler, and it doesn't talk about merge policies at all.
Which is the correct or most appropriate way? I'd really appreciate an explanation and/or a simple example of how to manage this scenario.
Thanks in advance.

What you have there looks pretty good.
You are using a private queue to do your work, and it's being saved to the persistent store.
If you only have a small number of changes, then you will be fine. In that case, you want to handle the NSManagedObjectContextDidSaveNotification for your context, and merge the changes into your other context with
[context mergeChangesFromContextDidSaveNotification:notification];
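To make the notification approach concrete, here is a minimal sketch of observing the private context's saves and merging them into the main context. The helper name ObserveSavesAndMerge is made up for illustration, and the sketch assumes the main context was created with NSMainQueueConcurrencyType so that performBlock: is available on it.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper (not from the original answer): merges every save made
// by `privateContext` into `mainContext`. Returns the observer token so the
// caller can later pass it to -[NSNotificationCenter removeObserver:].
// Assumption: `mainContext` uses NSMainQueueConcurrencyType.
static id<NSObject> ObserveSavesAndMerge(NSManagedObjectContext *privateContext,
                                         NSManagedObjectContext *mainContext) {
    return [[NSNotificationCenter defaultCenter]
        addObserverForName:NSManagedObjectContextDidSaveNotification
                    object:privateContext
                     queue:nil
                usingBlock:^(NSNotification *notification) {
        // With queue:nil the block runs on the thread that performed the save,
        // so hop onto the main context's queue before touching it.
        [mainContext performBlock:^{
            [mainContext mergeChangesFromContextDidSaveNotification:notification];
        }];
    }];
}
```

Register the observer once, right after creating the private context, and remove it when the private context is discarded.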
However, if you are really doing a lot of changes, you probably want a separate persistent store coordinator (attached to the same store). By doing this, you can write to the store, while MOCs on the other PSC are reading. If you share the PSC with other readers, only one will get access at a time and you could cause the readers to block until your write has finished.
Also, if you are doing lots of changes, make sure you do them in small batches, inside an autoreleasepool, saving after each batch. Take a look at this similar question: Where should NSManagedObjectContext be created?
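The batching advice could look roughly like the sketch below. The entity name "Item", its "name" attribute, and the function ImportNamesInBatches are assumptions made up for the example; the reset after each save is one possible way to keep memory flat and is only appropriate if nothing else holds on to objects from this context.

```objc
#import <CoreData/CoreData.h>

// Hypothetical importer: inserts one "Item" per name on the private context,
// saving after every `batchSize` inserts. "Item" and "name" are assumed to
// exist in the managed object model.
static void ImportNamesInBatches(NSArray<NSString *> *names,
                                 NSUInteger batchSize,
                                 NSManagedObjectContext *privateContext) {
    if (batchSize == 0) {
        return; // Avoid an endless loop on a zero batch size.
    }
    [privateContext performBlock:^{
        NSUInteger total = names.count;
        for (NSUInteger start = 0; start < total; start += batchSize) {
            // Drain temporary objects created while building each batch.
            @autoreleasepool {
                NSUInteger end = MIN(start + batchSize, total);
                for (NSUInteger i = start; i < end; i++) {
                    NSManagedObject *item =
                        [NSEntityDescription insertNewObjectForEntityForName:@"Item"
                                                      inManagedObjectContext:privateContext];
                    [item setValue:names[i] forKey:@"name"];
                }
                NSError *error = nil;
                if (![privateContext save:&error]) {
                    NSLog(@"Batch save failed: %@", error);
                    return;
                }
                // Forget the objects that were just saved.
                [privateContext reset];
            }
        }
    }];
}
```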
Finally, if you are doing lots of changes, it may be more efficient to just refetch your data than it will be to process all the merges. In that case, you don't need to observe the notification, and you don't need to do the merge. It's pretty easy. Note, that if you do this, you really should have a separate PSC... especially if you want to save and notify in small-ish batches.
[privateContext performBlock: ^{
// Parse files and/or call services and parse
// their responses
dispatch_async(dispatch_get_main_queue(), ^{
// Refetch the data you want... if on iOS, this is likely as simple
// as telling the fetched results controller to refetch, and
// reloading your table view (or whatever else is using the data).
});
}];

Related

iOS: How do Core Data Merge policies affect NSManagedObjectContext Save and Refresh operations

I have been reading about merge policies and it gives me conflicting ideas about when changes are merged.
I have two contexts, one on a background queue and one on the main queue. Both have NSOverwriteMergePolicy set, which I think is outright wrong. I am already having sync problems: during background sync of server data my NSManagedObjects are often outdated, and I end up saving them, causing loss of data.
Is there any place I could visit all the rules & order of precedence for context save, context refresh with respect to overriding policies?
I have seen all the documentation around merge policies but I am confused whether they take effect upon SAVE or REFRESH.
Also, up to some extent, this is very confusing. For example, Apple Docs state this for NSMergeByPropertyObjectTrumpMergePolicy:
A policy that merges conflicts between the persistent store's version
of the object and the current in-memory version by individual
property, with the external changes trumping in-memory changes.
The merge occurs by individual property.
For properties that have been changed in both the external source and in memory, the in-memory
changes trump the external ones.
How to ensure that my desired properties get modified / remain unaffected when I choose to modify / not modify them on different contexts?

TL;DR: Merge policy is evil. You should only write to Core Data in one synchronous way.
Full explanation: If an object is changed at the same time from two different contexts core-data doesn't know what to do. You can set a mergePolicy to set which change should win, but that really isn't a good solution, because you will lose data that way. The way that a lot of pros have been dealing with the problem for a long time was to have an operation queue to queue the writes so there is only one write going on at a time, and have another context on the main thread only for reads. This way you never get any merge conflicts. (see https://vimeo.com/89370886 for a great explanation on this setup).
Making this setup with NSPersistentContainer is very easy. In your Core Data manager create an NSOperationQueue
_persistentContainerQueue = [[NSOperationQueue alloc] init];
_persistentContainerQueue.maxConcurrentOperationCount = 1;
And do all writing using this queue:
- (void)enqueueCoreDataBlock:(void (^)(NSManagedObjectContext *context))block {
    void (^blockCopy)(NSManagedObjectContext *) = [block copy];
    [self.persistentContainerQueue addOperation:[NSBlockOperation blockOperationWithBlock:^{
        NSManagedObjectContext *context = self.persistentContainer.newBackgroundContext;
        [context performBlockAndWait:^{
            blockCopy(context);
            NSError *error = nil;
            if (![context save:&error]) {
                // Don't ignore the error: look at it and log it to your analytics service.
                NSLog(@"Core Data save failed: %@", error);
            }
        }];
    }]];
}
When you call enqueueCoreDataBlock the block is enqueued, which ensures that there are no merge conflicts. But if you write to the viewContext, that defeats this setup. Likewise you should treat any other contexts that you create (with newBackgroundContext or with performBackgroundTask) as read-only, because they will also be outside of the writing queue.
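For the read-only viewContext to pick up what the write queue saves, one option (an addition here, not something the answer spells out) is to let it merge saves automatically. A minimal sketch, with a made-up helper name:

```objc
#import <CoreData/CoreData.h>

// Hypothetical setup helper, meant to be called once after the container's
// stores have loaded. Requires iOS 10 / macOS 10.12 or later.
static void ConfigureViewContextForReading(NSPersistentContainer *container) {
    // Merge saves made by background contexts into the view context
    // automatically, so reads on the main thread see the new data.
    container.viewContext.automaticallyMergesChangesFromParent = YES;
}

// Usage sketch: writes still go through the serial queue, for example
//   [manager enqueueCoreDataBlock:^(NSManagedObjectContext *context) {
//       // modify objects fetched from `context` here
//   }];
// where `manager` is whatever object implements enqueueCoreDataBlock:.
```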

What is the best practice to pass object references between two queues with CoreData?

I am facing a decision to be made for an applications architecture design.
The Application uses CoreData to persist user information, the same information is also stored on a remote server accessible by a REST-Interface. When the Application starts I provide the cached information from CoreData to be displayed, while I fetch updates from the server. The fetched information is persisted automatically as well.
All of these tasks are performed in background queues so as not to block the main thread. I am keeping a strong reference to my persistentContainer and my NSManagedObject called User.
@property (nonatomic, retain, readwrite) User *fetchedLoggedInUser;
As I said the User is populated performing a fetch request via
[_coreDataManager.persistentContainer performBackgroundTask:^(NSManagedObjectContext * _Nonnull context) {
(...)
NSArray <User*>*fetchedUsers = [context executeFetchRequest:fetchLocalUserRequest error:&fetchError];
(...)
self.fetchedLoggedInUser = fetchedUsers.firstObject;
//load updates from server
Api.update(){
//update the self.fetchedLoggedInUser properties with data from the server
(...)
//persist the updated data
if (context.hasChanges) {
context.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy;
NSError *saveError = nil;
BOOL saveSucceeded = [context save:&saveError];
if (saveSucceeded) {
//notify the App about the updates
//**here is the problem**
}
};
}];
So the obvious thing about this is, that after performing the backgroundTask, my self.fetchedLoggedInUser is not in memory anymore, because of its weak reference to the NSManagedObjectContext provided by the performBackgroundTask() of my PersistentContainer.
Therefore, if I try to access the information from another Model, the values are nil.
What would be the best practice to keep the fetched ManagedObject in Memory and not have to fetch it again, every time I want to access its values?
A) In the Documentation, Apple suggests using the objectID of an ManagedObject, to pass objects between queues
Passing References Between Queues
NSManagedObject instances are not intended to be passed between
queues. Doing so can result in corruption of the data and termination
of the application. When it is necessary to hand off a managed object
reference from one queue to another, it must be done through
NSManagedObjectID instances.
You retrieve the managed object ID of a managed object by calling the
objectID method on the NSManagedObject instance.
The perfectly working code for that situation would be to replace the if(saveSucceeded) Check with this Code:
if (saveSucceeded) {
dispatch_async(dispatch_get_main_queue(), ^{
NSError *error = nil;
self.fetchedLoggedInUser = [self.coreDataManager.persistentContainer.viewContext existingObjectWithID:_fetchedLoggedInUser.objectID error:&error];
(...)
//notify the App about the updates
});
}
But I think this may not be the best solution here, as this needs to access the mainContext (in this case the persistentContainer's viewContext) on the mainQueue. This is likely contradictory to what I am trying to do here (performing on the background, to achieve best performance).
My other options (well, these, that I came up with) would be
B) to either store the user information in a Singleton and update it every time the information is fetched from and saved to CoreData. In this scenario I wouldn't need to worry about keeping the NSManagedObject context alive. I could perform any updates on a private background context provided by my persistentContainer's performBackgroundTask and whenever I'd need to persist new / edited user information I could refetch the NSManagedObject from the database, set the properties, save my context and then update the Singleton. I don't know if this is elegant though.
C) edit the getter Method of my self.fetchedLoggedInUser to contain a fetch request and fetch the needed information (this is probably the worst, because of the overhead when accessing the database) and I am not even sure if this would work at all.
I hope that one of these solutions is actually best practice, but I'd like to hear your suggestions why/how or why/how not to handle the passing of the information.
TL;DR: What's the best practice to keep user information available throughout the whole app, when loading and storing new information is mostly done from background queues?
PS: I don't want to fetch the information every time I need to access it in one of my ViewControllers; I want to store the data in a central place, so that it is easily accessible from every ViewController. Currently self.fetchedLoggedInUser is a property of a singleton used throughout the application. I find that this saves a lot of redundant code: using the singleton makes loading and storing the information clearer and reduces the number of database accesses. If this is considered bad practice I'd be happy to discuss that with you.

Use an NSFetchedResultsController - they are very efficient and you can use them even for one object. A fetched results controller does a fetch once and then monitors Core Data for changes. When the data changes, you get a callback. It also works perfectly for ANY Core Data setup. So long as the changes are propagated to the main context (either with newBackgroundContext or performBackgroundTask or child contexts or whatever) the fetchedResultsController will update. So you are free to change your Core Data stack without changing your monitoring code.
In general I don't like keeping pointers to ManagedObjects. If the entry is deleted from database then the managedObject will crash when you try to access it. A fetchedResultsController is always safe to read fetchedObjects as it tracks deletions for you.
Obviously attach the NSFetchedResultsController to the viewContext and only read it from the main thread.
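A rough sketch of such a controller for a single user follows. The function name, the "User" entity's "name" attribute used for sorting, and the predicate parameter are assumptions for illustration; a fetched results controller needs at least one sort descriptor, so some sortable attribute has to exist.

```objc
#import <CoreData/CoreData.h>

// Hypothetical factory: builds a controller that tracks the "User" objects
// matching `predicate` on the view context. Call it on the main thread.
// Assumption: the "User" entity has a "name" attribute to sort by.
static NSFetchedResultsController *MakeUserResultsController(
        NSManagedObjectContext *viewContext,
        NSPredicate *predicate,
        id<NSFetchedResultsControllerDelegate> delegate) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"User"];
    request.predicate = predicate; // May be nil to match every user.
    request.sortDescriptors =
        @[[NSSortDescriptor sortDescriptorWithKey:@"name" ascending:YES]];

    NSFetchedResultsController *controller =
        [[NSFetchedResultsController alloc] initWithFetchRequest:request
                                            managedObjectContext:viewContext
                                              sectionNameKeyPath:nil
                                                       cacheName:nil];
    // The delegate is told about later changes, for example through
    // -controllerDidChangeContent:.
    controller.delegate = delegate;

    NSError *error = nil;
    if (![controller performFetch:&error]) {
        NSLog(@"Initial fetch failed: %@", error);
    }
    return controller;
}
```

Read the current user from controller.fetchedObjects.firstObject on the main thread instead of keeping your own pointer to the managed object.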

I came up with a very elegant solution in my opinion.
From the beginning I was using a Singleton called sharedCoreDataManager, I added a property backgroundContext that is initialized like so
self.backgroundContext = _persistentContainer.newBackgroundContext;
_backgroundContext.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy;
_backgroundContext.retainsRegisteredObjects = YES;
and is retained by sharedCoreDataManager. I am using this context to perform any tasks. Through calling _backgroundContext.retainsRegisteredObjects my NSManagedObject is retained by the backgroundContext, which is itself (like I said) retained by my Singleton sharedCoreDataManager.
I think this is an elegant solution as I can access the ManagedObject threadsafe from the background anytime. I also won't need any extra class that holds the user information on top. And on top of that I can easily edit the user information at anytime and then call save() on my backgroundContext if needed.
Maybe I am going to add it as a child to my viewContext in the future, I'll evaluate the performance and eventually update this answer.
You are still welcome to propose a better solution or discuss this topic.

Merging / copying different Core Data contexts

I've been reading several posts related to this, but I'm still not sure how I should handle my scenario: I have a "root" context (the one provided in AppDelegate by default) where I insert the objects I use and show throughout the app (this context is intended for read-only operations). My objects' data come from several web services I call periodically, and whenever I need to call them and update my objects, I create a new context on a private queue to request and parse them:
NSManagedObjectContext *privateContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
AppDelegate *appDelegate = [[UIApplication sharedApplication] delegate];
privateContext.persistentStoreCoordinator = appDelegate.persistentStoreCoordinator;
[privateContext performBlock: ^{
// Call services and parse results
// Insert objects in context
// Save context
}];
When this finishes, I need to transfer the updates to my "root" context to keep showing them throughout the app. I don't know how I should do this; I've been thinking about some options:
To "clean" the root context and fetch again all the objects.
To "clean" the root context and copy/duplicate somehow the private context to it
To manage some kind of merge policy
This last point sounds to be the most appropriate one, if there is not a better one... is there? Could somebody give me a clear example / tutorial / sample code of a similar scenario? I don't fully understand yet concurrency in Core Data.
Thanks in advance

Is asynchronous Core Data needed in most cases?

I've learned that generally, intensive tasks should take place on background threads, as if they're on the main thread they'll block user interaction and interface updates.
Does Core Data fall under that umbrella? I received a great answer to my question about loading images asynchronously in a UITableView, but I'm curious how to then work with Core Data as the backend.
I know Apple has a Core Data Concurrency guide, but I'm curious in which cases one is supposed to use Core Data in the background.
As a quick example, say I'm making a Twitter client and want to get all the tweet information (tweet text, username, user avatar, linked images, etc.). I asynchronously download that information and receive some JSON from the Twitter API that I then parse. Should I then do a dispatch_async(dispatch_get_main_queue()...) when I add the information to Core Data?
I also then want to download the user's avatar, but I want to do that separately from presenting the tweet, so I can present the tweet quickly as possible, then present the image when ready. So I asynchronously download the image. Should I then update that Core Data item asynchronously?
Am I in a situation where I don't need multi-threaded Core Data at all? If so, when would be a situation where I need it?

I've learned that generally, intensive tasks should take place on background threads, as if they're on the main thread they'll block user interaction and interface updates.
Does Core Data fall under that umbrella?
Yes, actually.
Core Data tasks can and should - where possible - be executed on background threads or on non-main queues.
It's important to note though, that each managed object is associated to a certain execution context (a thread or a dispatch queue). Accessing a managed object MUST be executed only from within this execution context. This association comes from the fact that each managed object is registered with a certain Managed Object Context, and this managed object context is associated to a certain execution context when it is created.
See also:
-[NSManagedObjectContext initWithConcurrencyType:]
-[NSManagedObjectContext performBlock:]
-[NSManagedObjectContext performBlockAndWait:]
Consequently, when displaying properties of managed objects, this involves UIKit, and since UIKit methods MUST be executed on the main thread, the managed object's context must be associated to the main thread or main queue.
Otherwise, Core Data and user code can access managed objects from arbitrary execution contexts, as long as this is the one to which the managed object is associated with.
The picture below is an Instruments Time Profile which shows quite clearly how Core Data tasks can be distributed across different threads:
The highlighted "spike" in the CPU activity shows a task which performs the following:
Load 1000 JSON objects from origin (this is just a local JSON file),
Create managed objects in batches of 100 on a private context
Merge them into the Core Data stack's main context
Save the context (each batch)
Finally, fetch all managed objects into the main context
As we can see, only 26.4% of the CPU load is executed on the main thread.

Core Data does indeed fall under that umbrella. Particularly for downloading and saving data, but also possibly for fetching depending on the number of objects in the data store and the predicate to be applied.
Generally, I'd push all object creation and saving which is coming from a server onto a background thread. I'd only update and save objects on the main thread if they're user generated (because it would only be one, updated slowly and infrequently).
Downloading your twitter data is a good example as there will potentially be a good amount of data to process. You should process and save the data on a background thread and save it up to the persistent store there. The persistent store should then merge the changes down to the main thread context for you (assuming you have the contexts configured nicely - use a 3rd party framework for that like MagicalRecord).
Again, for the avatar update, you're already on a background thread so you might as well stay there :-)
You might not need to use multiple threads now. If you only download the 5 most recent tweets then you might not notice the difference. But using a framework can make the multi-threading relatively easy. And using a fetched results controller can make knowing when the UI should be updated on the main thread very easy. So, it's worthwhile taking the time to understand the setup and using it.

The best way to handle behavior like this is to use multiple NSManagedObjectContexts. The "main" context you create like so:
_mainManagedObjectContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
_mainManagedObjectContext.persistentStoreCoordinator = [self mainPersistentStoreCoordinator];
You're going to want to do any heavy writes on a different NSManagedObjectContext to avoid blocking your UI thread as you import (which can be quite noticeable on large or frequent operations to your Main context).
To achieve this, you would do something like this:
Retrieve your data asynchronously from the network
Spin up a temporary (or perhaps a permanent) background NSManagedObjectContext and set the "parent" to the main ManagedObjectContext (so it will merge the data you import when you save)
Use the performBlock: or performBlockAndWait: APIs to write to that context on its own private queue (in the background)
Example (uses AFNetworking):
[_apiClient GET:@"/mydata"
     parameters:nil
        success:^(AFHTTPRequestOperation *operation, id responseObject) {
    NSError *jsonError = nil;
    id jsonResponse = [NSJSONSerialization JSONObjectWithData:operation.responseData options:kNilOptions error:&jsonError];
    if (jsonResponse == nil) {
        NSLog(@"Error parsing JSON: %@", jsonError);
        return;
    }
    NSArray *parsedData = (NSArray *)jsonResponse;
    NSManagedObjectContext *context = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    context.parentContext = YOUR_MAIN_CONTEXT; // when you save this context, it will merge its changes into your main context!
    // the following code will occur in a background queue for you that CoreData manages
    [context performBlock:^{
        // THIS IS WHERE YOU IMPORT parsedData INTO CORE DATA
        NSError *saveError = nil;
        if (![context save:&saveError]) {
            NSLog(@"Error saving context: %@", saveError);
        } else {
            NSLog(@"Saved data import in background!");
        }
    }];
}
        failure:^(AFHTTPRequestOperation *operation, NSError *error) {
    NSLog(@"There was an error: %@", error);
}];
Docs: https://developer.apple.com/library/mac/documentation/Cocoa/Reference/CoreDataFramework/Classes/NSManagedObjectContext_Class/NSManagedObjectContext.html#//apple_ref/occ/instm/NSManagedObjectContext/performBlockAndWait:

As I alluded to in your other question, it's largely going to be a matter of how long the manipulations are going to take. If the computation required doesn't have a noticeable delay, by all means be lazy about it and do your Core Data work on the main thread.
In your other example, you're requesting 20 items from twitter, parsing 20 items and sticking them into CoreData isn't going to be noticeable. The best approach here is to probably continue to just fetch 20 at a time and update in the foreground as each chunk becomes available.
Downloading all items from twitter in one request will take a significant amount of time and computation, and it's probably worth creating a separate NSManagedObjectContext and synchronizing it with the foreground context. Since you really have one-way data (it always flows twitter->core data->user interface) the likelihood of clashes is minimal, so you can easily use NSManagedObjectContextDidSaveNotification and mergeChangesFromContextDidSaveNotification:

Core Data multi thread application

I'm trying to use core data in a multi thread way.
I simply want to show the application with the previously downloaded data while downloading new data in background.
This should let the user access the application during update process.
I have an NSURLConnection which downloads the file asynchronously using a delegate (and showing the progress), then I use an XMLParser to parse the new data and create new NSManagedObjects in a separate context, with its own persistentStore and using a separate thread.
The problem is that creating new objects in the same context as the old ones while showing them can throw a BAD_INSTRUCTION exception.
So, I decided to use a separate context for the new data, but I can't figure out a way to move all the objects to the other context once finished.
Paolo aka SlowTree

The Apple Concurrency with Core Data documentation is the place to start. Read it really carefully... I was bitten many times by my misunderstandings!
Basic rules are:
Use one NSPersistentStoreCoordinator per program. You don't need them per thread.
Create one NSManagedObjectContext per thread.
Never pass an NSManagedObject on a thread to the other thread.
Instead, get the object IDs via -objectID and pass it to the other thread.
More rules:
Make sure you save the object into the store before getting the object ID. Until saved, they're temporary, and you can't access them from another thread.
And beware of the merge policies if you make changes to the managed objects from more than one thread.
NSManagedObjectContext's -mergeChangesFromContextDidSaveNotification: is helpful.
But let me repeat, please read the document carefully! It's really worth it!
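The object ID rules above can be sketched as follows. The function name and the handler block are made up for the example, and the sketch assumes the main context was created with NSMainQueueConcurrencyType so that performBlock: can be used on it.

```objc
#import <CoreData/CoreData.h>

// Hypothetical hand-off helper. Call it on the queue of the context that owns
// `object`, after that context has been saved. The handler runs on the main
// context's queue with that context's own instance of the object, or nil if
// the object could not be loaded.
static void HandOffToMainContext(NSManagedObject *object,
                                 NSManagedObjectContext *mainContext,
                                 void (^handler)(NSManagedObject *mainObject)) {
    NSManagedObjectID *objectID = object.objectID;
    // Temporary IDs are only valid in the originating context, so insist on a save first.
    NSCAssert(!objectID.isTemporaryID, @"Save the context before passing object IDs");

    [mainContext performBlock:^{
        NSError *error = nil;
        NSManagedObject *mainObject =
            [mainContext existingObjectWithID:objectID error:&error];
        if (mainObject == nil) {
            NSLog(@"Could not load object in main context: %@", error);
        }
        handler(mainObject);
    }];
}
```

Only the NSManagedObjectID crosses the queue boundary; each context works with its own instance of the object.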

Currently [May 2015] the Apple Concurrency with Core Data documentation is, at best, very misleading as it doesn't cover any of the enhancements in iOS 5 and hence no longer shows the best ways to use core data concurrently. There are two very important changes in iOS 5 - parent contexts and new concurrency/threading types.
I have not yet found any written documentation that comprehensively covers these new features, but the WWDC 2012 video "Session 214 - Core Data Best Practices" does explain it all very well.
Magical Record uses these new features and may be worth a look.
The real basics are still the same - you can still only use managed objects on the thread their managed object context was created on.
You can now use [moc performBlock:] to run code on the right thread.
There's no need to use mergeChangesFromContextDidSaveNotification: anymore; instead create a child context to make the changes, then save the child context. Saving the child context will automatically push the changes up into the parent context, and to save the changes to disk just perform a save on the parent context in its thread.
For this to work you must create the parent context with a concurrent type, eg:
mainManagedObjectContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
Then on the background thread:
context = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSConfinementConcurrencyType];
[context setParentContext:mainManagedObjectContext];
<... perform actions on context ...>
NSError *error;
if (![context save:&error])
{
<... handle error ...>
}
[mainManagedObjectContext performBlock:^{
NSError *e = nil;
if (![mainManagedObjectContext save:&e])
{
<... handle error ...>
}
}];

I hope this helps everyone who runs into problems using Core Data in a multithreaded environment.
Take a look at "Top Songs 2" in Apple's documentation. With this code I took the "red pill" of The Matrix and discovered a new world, without double-free errors and without faults. :D
Hope this helps.
Paolo
p.s.
Many thanks to Yuji, in the documentation you described above I found that example.
