iOS: How do Core Data Merge policies affect NSManagedObjectContext Save and Refresh operations - ios

I have been reading about merge policies and it gives me conflicting ideas about when changes are merged.
I am having two contexts - one in background queue and one in main queue. Both have policies defined NSOverwriteMergePolicy which I think is outright wrong. I am already having problems in sync. I often witness that in background sync from server data my NSManagedObjects are often outdated, and I end up saving them, causing loss of data.
Is there any place I could visit all the rules & order of precedence for context save, context refresh with respect to overriding policies?
I have seen all the documentation around merge policies but I am confused whether they take effect upon SAVE or REFRESH.
Also, up to some extent, this is very confusing. For example, Apple Docs state this for NSMergeByPropertyObjectTrumpMergePolicy:
A policy that merges conflicts between the persistent store's version
of the object and the current in-memory version by individual
property, with the external changes trumping in-memory changes.
The merge occurs by individual property.
For properties that have been changed in both the external source and in memory, the in-memory
changes trump the external ones.
How to ensure that my desired properties get modified / remain unaffected when I choose to modify / not modify them on different contexts?

TL:DR: Merge policy is evil. You should only write to core-data in one synchronous way.
Full explanation: If an object is changed at the same time from two different contexts core-data doesn't know what to do. You can set a mergePolicy to set which change should win, but that really isn't a good solution, because you will lose data that way. The way that a lot of pros have been dealing with the problem for a long time was to have an operation queue to queue the writes so there is only one write going on at a time, and have another context on the main thread only for reads. This way you never get any merge conflicts. (see https://vimeo.com/89370886 for a great explanation on this setup).
Making this setup with NSPersistentContainer is very easy. In your core-data manager create a NSOperationQueue
_persistentContainerQueue = [[NSOperationQueue alloc] init];
_persistentContainerQueue.maxConcurrentOperationCount = 1;
And do all writing using this queue:
- (void)enqueueCoreDataBlock:(void (^)(NSManagedObjectContext* context))block{
void (^blockCopy)(NSManagedObjectContext*) = [block copy];
[self.persistentContainerQueue addOperation:[NSBlockOperation blockOperationWithBlock:^{
NSManagedObjectContext* context = self.persistentContainer.newBackgroundContext;
[context performBlockAndWait:^{
blockCopy(context);
[context save:NULL]; //Don't just pass NULL here. look at the error and log it to your analytics service
}];
}]];
}
When you call enqueueCoreDataBlock the block is enqueued to ensures that there are no merge conflicts. But if you write to the viewContext that would defeat this setup. Likewise you should treat any other contexts that you create (with newBackgroundContext or with performBackgroundTask) as readonly because they will also be outside of the writing queue.

Related

What is the best practice to pass object references between two queues with CoreData?

I am facing a decision to be made for an applications architecture design.
The Application uses CoreData to persist user information, the same information is also stored on a remote server accessible by a REST-Interface. When the Application starts I provide the cached information from CoreData to be displayed, while I fetch updates from the server. The fetched information is persisted automatically as well.
All of these tasks are performed in background queues as to not block the main thread. I am keeping a strong reference to my persistenContainer and my NSManagedObject called User.
#property (nonatomic, retain, readwrite) User *fetchedLoggedInUser;
As I said the User is populated performing a fetch request via
[_coreDataManager.persistentContainer performBackgroundTask:^(NSManagedObjectContext * _Nonnull context) {
(...)
NSArray <User*>*fetchedUsers = [context executeFetchRequest:fetchLocalUserRequest error:&fetchError];
(...)
self.fetchedLoggedInUser = fetchedUsers.firstObject;
//load updates from server
Api.update(){
//update the self.fetchedLoggedInUser properties with data from the server
(...)
//persist the updated data
if (context.hasChanges) {
context.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy;
NSError *saveError = nil;
BOOL saveSucceeded = [context save:&saveError];
if (saveSucceeded) {
//notify the App about the updates
//**here is the problem**
}
};
}];
So the obvious thing about this is, that after performing the backgroundTask, my self.fetchedLoggedInUser is not in memory anymore, because of its weak reference to the NSManagedObjectContext provided by the performBackgroundTask() of my PersistentContainer.
Therefore, if I try to access the information from another Model, the values are nil.
What would be the best practice to keep the fetched ManagedObject in Memory and not have to fetch it again, every time I want to access its values?
A) In the Documentation, Apple suggests using the objectID of an ManagedObject, to pass objects between queues
Passing References Between Queues
NSManagedObject instances are not intended to be passed between
queues. Doing so can result in corruption of the data and termination
of the application. When it is necessary to hand off a managed object
reference from one queue to another, it must be done through
NSManagedObjectID instances.
You retrieve the managed object ID of a managed object by calling the
objectID method on the NSManagedObject instance.
The perfectly working code for that situation would be to replace the if(saveSucceeded) Check with this Code:
if (saveSucceeded) {
dispatch_async(dispatch_get_main_queue(), ^{
NSError *error = nil;
self.fetchedLoggedInUser = [self.coreDataManager.persistentContainer.viewContext existingObjectWithID:_fetchedLoggedInUser.objectID error:&error];
(...)
//notify the App about the updates
});
}
But I think this may not be the best solution here, as this needs to access the mainContext (in this case the persistentContainer's viewContext) on the mainQueue. This is likely contradictory to what I am trying to do here (performing on the background, to achieve best performance).
My other options (well, these, that I came up with) would be
B) to either store the user information in a Singleton and update it every time the information is fetched from and saved to CoreData. In this scenario I wouldn't need to worry about keeping the NSManagedObject context alive. I could perform any updates on a private background context provided by my persistentContainer's performBackgroundTask and whenever I'd need to persist new / edited user information I could refetch the NSManagedObject from the database, set the properties, save my context and then update the Singleton. I don't know if this is elegant though.
C) edit the getter Method of my self.fetchedLoggedInUser to contain a fetch request and fetch the needed information (this is probably the worst, because of the overhead when accessing the database) and I am not even sure if this would work at all.
I hope that one of these solutions is actually best practice, but I'd like to hear your suggestions why/how or why/how not to handle the passing of the information.
TL:DR; Whats the best practice to keep user information available throughout the whole app, when loading and storing new information is mostly done from backgroundQueues?
PS: I don't want to fetch the information every time I need to access it in one of my ViewControllers, I want to store the data on a central knot, so that it is accessible from every ViewController with ease. Currently the self.fetchedLoggedInUser is a property of a singleton used throughout the application. I find that this saves a lot of redundant code, using the Singleton makes loading and storing the information clearer and reduces the access count to the database. If this is considered bad practice I'd be happy to discuss about that with you.
Use a NSFetchedResultsController - they are very efficient and you can use them even for one object. A FetchedResultsController does a fetch once and then monitors core data for changes. When it changes you have a callback that it has changed. It also works perfectly for ANY core-data setup. So long as the changes are propagated to the main context (either with newBackgroundContext or performBackgroundTask or child contexts or whatever) the fetchedResultsController will update. So you are free to change your core-data stack without changes your monitoring code.
In general I don't like keeping pointers to ManagedObjects. If the entry is deleted from database then the managedObject will crash when you try to access it. A fetchedResultsController is always safe to read fetchedObjects as it tracks deletions for you.
Obviously attach the NSFetchedResultsController to the viewContext and only read it from the main thread.
I came up with a very elegant solution in my opinion.
From the beginning I was using a Singleton called sharedCoreDataManager, I added a property backgroundContext that is initialized like so
self.backgroundContext = _persistentContainer.newBackgroundContext;
_backgroundContext.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy;
_backgroundContext.retainsRegisteredObjects = YES;
and is retained by sharedCoreDataManager. I am using this context to perform any tasks. Through calling _backgroundContext.retainsRegisteredObjects my NSManagedObject is retained by the backgroundContext, which is itself (like I said) retained by my Singleton sharedCoreDataManager.
I think this is an elegant solution as I can access the ManagedObject threadsafe from the background anytime. I also won't need any extra class that holds the user information on top. And on top of that I can easily edit the user information at anytime and then call save() on my backgroundContext if needed.
Maybe I am going to add it as a child to my viewContext in the future, I'll evaluate the performance and eventually update this answer.
You are still welcome to propose a better solution or discuss this topic.

Merging main and private contexts with Core Data

I'm creating temportary contexts in a private queue to asynchronously update the data I persist with Core Data:
NSManagedObjectContext *privateContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
AppDelegate *appDelegate = [[UIApplication sharedApplication] delegate];
privateContext.persistentStoreCoordinator = appDelegate.persistentStoreCoordinator;
[privateContext performBlock: ^{
// Parse files and/or call services and parse
// their responses
dispatch_async(dispatch_get_main_queue(), ^{
// Notify update to user
});
}];
Then, once I've got the new data, I need to merge the changes with my main context. I know this is a common scenario, but I'm not sure how should I proceed... This Apple's Core Data documentation section talks about setting a merge policy and I don't fully understand the way to handle that. On the other hand, I found this link, where my scenario is described in its "Stack #2" section and what it says looks simpler, and it doesn't talk about merge policies at all...
Which the correct or most appropriate way should be? I'd really appreciate an explanation and/or simple example of how to manage this scenario.
Thanks in advance.
What you have there looks pretty good.
You are using a private queue to do your work, and it's being saved to the persistent store.
If you only have a small number of changes, then you will be fine. In that case, you want to handle the NSManagedObjectContextDidSaveNotification for your context, and merge the changes into your other context with
[context mergeChangesFromContextDidSaveNotification:notification];
However, if you are really doing a lot of changes, you probably want a separate persistent store coordinator (attached to the same store). By doing this, you can write to the store, while MOCs on the other PSC are reading. If you share the PSC with other readers, only one will get access at a time and you could cause the readers to block until your write has finished.
Also, if you are doing lots of changes, make sure you do them in small batches, inside an autoreleasepool, saving after each batch. Take a look at this similar question: Where should NSManagedObjectContext be created?
Finally, if you are doing lots of changes, it may be more efficient to just refetch your data than it will be to process all the merges. In that case, you don't need to observe the notification, and you don't need to do the merge. It's pretty easy. Note, that if you do this, you really should have a separate PSC... especially if you want to save and notify in small-ish batches.
[privateContext performBlock: ^{
// Parse files and/or call services and parse
// their responses
dispatch_async(dispatch_get_main_queue(), ^{
// Refetch the data you want... if on iOS, this is likely as simple
// as telling the fetched results controller to refetch, and
// reloading your table view (or whatever else is using the data).
});
}];

Core Data Concurrency with parent/child relationship and NSInvocationOperation done right?

I've been using Core Data to store iOS app data in a SQLite DB. Data is also synchronized to a central server during login or after n minutes have passed and as the app grew, synchronization got slower and needed a background process to prevent freezing of the main thread.
For this I rewrote my Synchronization class to use NSInvocationOperation class for each process in the synchronization method. I'm using dependencies as the data being imported/synchronized has several dependent relationships.
Then I add all operations into an NSOperationQueue and let it run.
There are two additional important elements:
When I generate the operations to execute I also create a new UUID NSString, which represents a synchronization ID. This is used to track which Entities in Core Data have been synchronized, therefore, every Entity has an attribute that stores this synchronization ID. It is only used during the synchronization process and is accesses by each method passed to NSInvocationOperation.
The import and synchronization methods are the actual methods that access Core Data. Inside each of those methods I create a new instance of NSManagedObjectContext as follows:
PHAppDelegate *appDelegate = [[UIApplication sharedApplication] delegate];
NSManagedObjectContext *privateContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
[privateContext setParentContext:[appDelegate mainManagedObjectContext]];
using NSPrivateQueueConcurrencyType as opposed to NSMainQueueConcurrencyType which is used in the AppDelegate for the main MOC. Then I create the parent/child relationship using the MOC from the main thread and that's that. When saving the child MOC, I understand that changes propagate back to the main MOC and to the store (SQLite).
I'm fairly new to concurrency in iOS and used this to guide me: http://code.tutsplus.com/tutorials/core-data-from-scratch-concurrency--cms-22131 (see Strategy #2). Different from this article and many other sources that I've seen, I do not subclass NSOperation. Instead, my approach uses NSInvocationOperation in order to pass each method that I want to run concurrently.
My question: Can you take a look at my code and see if you find something that I'm doing wrong or do you think that this will work? So far, I've had no problems running the code. But concurrency is tough to debug, so before we ship this code, I'd like to have more experienced eyes on this.
Thanks in advance!
Synchronization class code on pastebin.com: http://pastebin.com/CUzWw4Tv
(not everything is in the code paste, but it should be enough to evaluate the process, let me know if you need more)

Is it valid to mix the "1 moc per thread" strategy with "child moc" strategy?

Until now I always used a "main moc" for the main thread, initialised like this:
[[NSManagedObjectContext alloc] init];
and then I have NSOperation subclasses with their own moc that import data from the webservice, and I merge in the "main" moc on save observing NSManagedObjectContextDidSaveNotification
But now I need the ability to add "temporary" objects that the user can commit (or not) later. So it looks like a child context is the perfect fit, and in order to use child context I changed the initialization of my "main moc" to
[[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
The question is: can my current structure with NSOperation subclasses with their own moc (initialised without a type in their own thread) have problems if used along with the child context strategy? I don't think so, but I don't find much about mixing those strategies.
Note that I want to maintain the NSOperation subclasses and I don't want to use child contexts also for importing my data, because it suffers on performances, see http://floriankugler.com/blog/2013/4/29/concurrent-core-data-stack-performance-shootout
Moreover, when I create a new child of my main thread (that is of type NSMainQueueConcurrencyType), can I create it that child with type NSMainQueueConcurrencyType, continuing to work with my objects in the main thread as usual? Or am I forced to use NSPrivateQueueConcurrencyType, and use performBlock: for every operation on my objects?
I'm asking because is not clear from the documentation if using 2 moc on the same thread (the main thread in this case) could be a problem.
UPDATE:
Finally I implemented and used this solution on production and there are not problems so far. The only thing that I needed to do is to avoid merging on my NSManagedObjectContextDidSaveNotification notification when the moc has a parentContext (we don't want to merge mocs with a parent context, because they manage the merge themselves, but obviously the notification is triggered also for save on this kind of moc)
Yes, you can have multiple main queue context mocs, for exactly the reason you say - you create a temporary editing context for editing data which is then saved or discarded depending on user action.
As for mixing and matching with your operation queue contexts - that shouldn't be a problem. If you're merging back to the parent context, then any child contexts will pick up that data the next time they fetch.
Actually, that's indicated when working with multiple threads. Here, I've wrote an article exactly about this. The slave mocs mentioned in it, are designed exactly for working with operations, each operations on it's own slave moc.

Core Data multi thread application

I'm trying to use core data in a multi thread way.
I simply want to show the application with the previously downloaded data while downloading new data in background.
This should let the user access the application during update process.
I have a NSURLConnection which download the file asyncronously using delegate (and showing the progress), then i use an XMLParser to parse the new data and create new NSManagedObjects in a separate context, with its own persistentStore and using a separate thread.
The problem is that creating new objects in the same context of the old one while showing it can throws BAD_INSTRUCTION exception.
So, I decided to use a separate context for the new data, but I can't figure out a way to move all the objects to the other context once finished.
Paolo aka SlowTree
The Apple Concurrency with Core Data documentation is the place to start. Read it really carefully... I was bitten many times by my misunderstandings!
Basic rules are:
Use one NSPersistentStoreCoordinator per program. You don't need them per thread.
Create one NSManagedObjectContext per thread.
Never pass an NSManagedObject on a thread to the other thread.
Instead, get the object IDs via -objectID and pass it to the other thread.
More rules:
Make sure you save the object into the store before getting the object ID. Until saved, they're temporary, and you can't access them from another thread.
And beware of the merge policies if you make changes to the managed objects from more than one thread.
NSManagedObjectContext's -mergeChangesFromContextDidSaveNotification: is helpful.
But let me repeat, please read the document carefully! It's really worth it!
Currently [May 2015] the Apple Concurrency with Core Data documentation is, at best, very misleading as it doesn't cover any of the enhancements in iOS 5 and hence no longer shows the best ways to use core data concurrently. There are two very important changes in iOS 5 - parent contexts and new concurrency/threading types.
I have not yet found any written documentation that comprehensively covers these new features, but the WWDC 2012 video "Session 214 - Core Data Best Practices" does explain it all very well.
Magical Record uses these new features and may be worth a look.
The real basics are still the same - you can still only use managed objects the thread their managed object context was created on.
You can now use [moc performBlock:] to run code on the right thread.
There's no need to use mergeChangesFromContextDidSaveNotification: anymore; instead create a child context to make the changes, then save the child context. Saving the child context will automatically push the changes up into the parent context, and to save the changes to disk just perform a save on the parent context in it's thread.
For this to work you must create the parent context with a concurrent type, eg:
mainManagedObjectContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
Then on the background thread:
context = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSConfinementConcurrencyType];
[context setParentContext:mainManagedObjectContext];
<... perform actions on context ...>
NSError *error;
if (![context save:&error])
{
<... handle error ...>
}
[mainManagedObjectContext performBlock:^{
NSError *e = nil;
if (![mainContext save:&e])
{
<... handle error ...>
}
}];
I hope this can help all the peoples that meet problems using core data in a multithread environment.
Take a look at "Top Songs 2" in apple documentation. With this code i took the "red pill" of Matrix, and discovered a new world, without double free error, and without faults. :D
Hope this helps.
Paolo
p.s.
Many thanks to Yuji, in the documentation you described above I found that example.

Resources