Sibling NSManagedObjectContexts when using NSOperations - iOS

I'm running into some trouble working with CoreData in a multithreaded app using NSOperations. I am using nested ManagedObjectContexts through MagicalRecord (2.0.3) as follows:
Root Context (saves to disk)
|
Main Thread Context (for populating the UI)
|
Sub-Context(s) (used to add/edit/remove data)
I have a single NSOperationQueue to handle all data processing.
For the most part, things work: I can asynchronously download data, then feed it to an NSOperation that writes it to one of the sub-contexts. Saving at the end of the operation pushes the changes to the main context and the UI updates. Great!
The problem is that if a sub-context deletes an entity and saves (pushing it to the main context), a sibling sub-context will still think that it exists. So if the sibling then tries to fault the entity and pull it from its parent (the main context), I get a crash.
I have 2 questions:
Should I use MOC notifications to merge the changes pushed to the main MOC back down to its other children? I tried this and was getting another crash...
Should I even have multiple sub-contexts? MOCs are supposed to be associated with a single thread (MagicalRecord helps automate this for me), and I have a single NSOperationQueue for saving data, so shouldn't I only have one sub-context? I've verified that sometimes my saves are performed by different contexts.
I'd appreciate any advice. Thanks.

You can, and should, have multiple sub-contexts. However, I'm not sure the classic "thread isolation mode" for contexts is the model you should be using anymore; that is what you're describing when you say a context should belong to a particular thread. MagicalRecord 2.0.x uses private queue contexts now, and as such behaves a tad differently. There is no rule saying that sibling contexts stay in sync; you'd have to do that yourself. A very simple solution would be to listen for "did save" notifications from the context you're saving, and either reset, or create a new, context on the other thread. You can do this with notifications, or with a completion block that MagicalRecord provides.
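As a rough illustration, here is a minimal sketch of that notification approach using plain Core Data APIs; the names savingContext and siblingContext are assumptions, and with MagicalRecord you could equally hang this off its save completion block:
// Register once, e.g. when the sibling context is created. When the other
// sub-context saves, refresh the sibling so it stops believing that
// deleted objects still exist.
[[NSNotificationCenter defaultCenter] addObserverForName:NSManagedObjectContextDidSaveNotification
                                                  object:savingContext
                                                   queue:nil
                                              usingBlock:^(NSNotification *note) {
    [siblingContext performBlock:^{
        // Either merge the saved changes into the sibling...
        [siblingContext mergeChangesFromContextDidSaveNotification:note];
        // ...or throw its state away entirely:
        // [siblingContext reset];
    }];
}];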
Hope this helps

Related

Magical Record with concurrency

I'm in the middle of developing an iOS application, after working for quite some time with Core Data and Magical Record, and I'm getting an error:
error: NULL _cd_rawData but the object is not being turned into a fault
I didn't know Core Data before this project, and as it turns out I was very naive to think I could just work with Magical Record without worrying about concurrency; I hadn't given any thought or work to managed contexts for the main thread versus background threads.
After A LOT of reading about Core Data Managed Object Contexts and Magical Record, I understand that:
NSManagedObjects are not thread safe.
NSManagedObjectID IS thread safe.
I can use: Entity *localEntity = [entity MR_inContext:localContext] of Magical Record to work with an entity in a background thread's context.
I should use Magical Record's saveWithBlock:completion: and saveWithBlockAndWait: methods to get a managed context to use for background threads (a sketch of this pattern follows this list).
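For concreteness, a minimal sketch combining points 3 and 4; Entity, its name attribute, and the entity variable (a managed object previously fetched on the main thread) are placeholders:
// Perform the edit on a background context provided by MagicalRecord.
[MagicalRecord saveWithBlock:^(NSManagedObjectContext *localContext) {
    // Re-reference the main-thread object in the background context first.
    Entity *localEntity = [entity MR_inContext:localContext];
    localEntity.name = @"Updated in the background";
} completion:^(BOOL contextDidSave, NSError *error) {
    // Runs after the save; update the UI / DataModel wrappers from here,
    // dispatching to the main queue if needed.
}];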
A little information regarding my application:
I'm using the latest version of Magical Record which is 2.2.
I have a backend server which my application talks to a lot.
The communication is similar to WhatsApp's: background threads talk to the server and update managed objects upon successful responses.
I'm wrapping the model with DataModel objects that hold the managed objects in arrays for quick referencing of UI/background use.
Now - my questions are:
Should I fetch only from the UI thread? Is it okay that I'm holding the managed objects in DataModel objects?
How can I create a new entity from a background thread and use the newly created entity in the DataModel objects?
Is there a best-practice design I should follow? Specifically, when sending a request to the server and getting a response, should I create a new managed context and use it throughout the thread's activity?
Let me know if everything is clear. If not, I'll try and add clarity.
Any help or guidelines would be appreciated.
Thanks!
I'm not working with MagicalRecord, but these questions are more related to CoreData than to MagicalRecord, so I'll try to answer them :).
1) Fetching from the main (UI) thread
There are many ways to design an app's model, so here are two important things I've learned from using Core Data for a few years:
when dealing with UI, always fetch objects on the main thread. As you correctly stated, NSManagedObjects are not thread safe, so you can't (well, can, but shouldn't) access their data from a different thread. NSFetchedResultsController is your best friend when you need to display long lists (e.g. for messages – but watch out for the request's batchSize).
you should design your storage and fetches to be fast and responsive. Use indexes, fetch only needed properties, prefetch relationships etc.
on the other hand, if you need to fetch from a large amount of data, you can use a context on a different thread and transfer NSManagedObjectIDs only. Let's say your user has a huge number of messages and you want to show the latest 10 from a specific contact. You can create a background context (private queue concurrency), fetch those 10 message IDs (NSManagedObjectIDResultType), store them in an array (or any other format that suits you), return them to your main thread, and fetch only those IDs there (a sketch of this follows the list). Note that this approach speeds things up when the fetch is slow because of the predicate/sortDescriptor, not when the "problem" is in turning faults into objects (e.g. a large UIImage stored in a transformable attribute :) )
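Here is a minimal sketch of that ID-only fetch; Message, its date/contact attributes, contactID, and mainContext are assumptions standing in for your own model and stack:
NSManagedObjectContext *bgContext =
    [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
bgContext.persistentStoreCoordinator = mainContext.persistentStoreCoordinator;

[bgContext performBlock:^{
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Message"];
    request.predicate = [NSPredicate predicateWithFormat:@"contact == %@", contactID];
    request.sortDescriptors = @[[NSSortDescriptor sortDescriptorWithKey:@"date" ascending:NO]];
    request.fetchLimit = 10;
    request.resultType = NSManagedObjectIDResultType;   // IDs only, no row data loaded

    NSError *error = nil;
    NSArray *objectIDs = [bgContext executeFetchRequest:request error:&error];

    dispatch_async(dispatch_get_main_queue(), ^{
        // Re-fetch the same rows in the main context using the thread-safe IDs.
        NSFetchRequest *mainRequest = [NSFetchRequest fetchRequestWithEntityName:@"Message"];
        mainRequest.predicate = [NSPredicate predicateWithFormat:@"self IN %@", objectIDs];
        NSError *mainError = nil;
        NSArray *messages = [mainContext executeFetchRequest:mainRequest error:&mainError];
        // ... hand `messages` to the UI
    });
}];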
2) Creating an entity in the background
You can create the object in the background context, store its NSManagedObjectID after you save the context (the object only has a temporary ID before the save), and send it back to your main thread, where you can fetch by that ID and get the object in your main context.
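A rough sketch of that round trip; backgroundContext, mainContext, and the Message entity are assumptions, and it presumes the background save reaches a store (or parent) the main context can see:
[backgroundContext performBlock:^{
    NSManagedObject *newMessage =
        [NSEntityDescription insertNewObjectForEntityForName:@"Message"
                                      inManagedObjectContext:backgroundContext];
    // ... configure newMessage ...

    NSError *error = nil;
    [backgroundContext save:&error];              // the object gets its permanent ID here
    NSManagedObjectID *objectID = newMessage.objectID;

    dispatch_async(dispatch_get_main_queue(), ^{
        NSError *mainError = nil;
        NSManagedObject *mainMessage =
            [mainContext existingObjectWithID:objectID error:&mainError];
        // store `mainMessage` in your DataModel wrapper
    });
}];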
3) Working with a background context
I don't know if it's exactly the best, but I'm pretty much satisfied with NSManagedObjectContext observation and merging from notifications. Check out:
mergeChangesFromContextDidSaveNotification:
So, you create a background context and register the main context's owner as an observer of the background context's NSManagedObjectContextDidSaveNotification. The background context then automatically notifies you (every time you perform a save) about all of its changes – inserted/updated/deleted objects – and you can merge them by calling mergeChangesFromContextDidSaveNotification: (a registration sketch follows the two lists below). This has many advantages, such as:
everything is updated automatically (every object you have fetched in "observing context" gets updated/deleted)
every merge runs in memory (no fetches, no persisting on main thread)
if you implement NSFetchedResultsController's delegate method, everything updates automatically (not exactly everything – see below)
On the other side:
take care with the merge policy (NSManagedObjectContext's mergePolicy)
watch out for referencing managed objects that got deleted from the background (or just another context)
NSFetchedResultsController updates only on changes to "direct" attributes (check out this SO question)
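For reference, a minimal version of that registration and merge; self.mainContext and backgroundContext are assumptions:
// Typically registered right where the background context is created.
[[NSNotificationCenter defaultCenter] addObserver:self
                                         selector:@selector(backgroundContextDidSave:)
                                             name:NSManagedObjectContextDidSaveNotification
                                           object:backgroundContext];

- (void)backgroundContextDidSave:(NSNotification *)notification
{
    // Merge on the main context's own queue.
    [self.mainContext performBlock:^{
        [self.mainContext mergeChangesFromContextDidSaveNotification:notification];
    }];
}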
Well, I hope this answers your questions. If anything comes up, don't hesitate to ask :)
Side note about child contexts
Also take a peek at child contexts. They can be powerful too. Basically, every child context sends its changes to its parent context on save (a "base" context with no parent sends its changes to the persistent store coordinator).
For example, when you are creating an edit/add controller, you can create a child context from your main context and perform all changes in it. When the user cancels the operation, you just destroy (remove the reference to) the child context and no changes are stored. If the user accepts the changes, save the child context and then destroy it. Saving the child context propagates all of its changes to its parent (in this example your main context). Just be sure to also save the parent context (at some point) to persist these changes – the save: method doesn't bubble all the way up. Check out the documentation on managing the parent store.
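That scenario looks roughly like this; self.mainContext is an assumption, and both contexts here are main-queue contexts:
// A throwaway editing context for an add/edit screen.
NSManagedObjectContext *editingContext =
    [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
editingContext.parentContext = self.mainContext;

// ... the edit screen works only with objects (re)fetched in editingContext ...

// On "Cancel": simply discard the context; nothing reaches the parent.
//     editingContext = nil;

// On "Done": push the edits up one level, then persist them from the parent.
NSError *error = nil;
if ([editingContext save:&error]) {
    NSError *parentError = nil;
    [self.mainContext save:&parentError];   // save: does not bubble automatically
}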
Happy coding!

Guidance on how to handle background and foreground core data with UI responsiveness

So I've been playing with this "issue" for a long time and I keep running into problems.
Here is, more or less, what I would like for my data flow:
User Segues to Controller
Controller pings FTP site (if up continue)
Loop through each parent record
Parse each record (this is a slow process) turning it and all sub data into a giant JSON structure which is stored in a temp directory
Update the Screen to the user with the progress of the parsing task
Upload the JSON Structure via FTP
Once successfully uploaded update the "isUploaded" field on the record
Now, where I've gotten into all sorts of issues is keeping the UI updated. I'm designing a fairly simple UI with a status bar that shows the progress of each JSON parsing task. I can make things work, but once I try to get a nice responsive UI I run into all sorts of issues.
I understand I'm supposed to do Core Data work on the main thread, but in doing so my UI doesn't update and becomes unresponsive.
I segue an NSManagedObjectContext over to this controller, and I understand that from that context I can reference the NSPersistentStoreCoordinator and create a second NSManagedObjectContext for a second thread, and that I can somehow synchronize these threads.
I've gotten things into a mess with performSelectorOnMainThread and performSelectorInBackground calls all over the place etc.
Either I get things working the way I'd like, or I get some sort of error with my managed object contexts, and I'm thinking perhaps it's time for a little rewrite instead of trying to salvage what I have.
Can somebody point me to some ideas about what I should use? I can't seem to wrap my head around the multithreaded Core Data concepts, such as whether I'm supposed to use NSOperationQueues, GCD, or some other way of doing things.
CoreData has the ability to operate in a multithreaded environment, as long as you keep managed objects and contexts bound to their proper thread/queue.
To scratch the surface, see HERE
You can create a background context by initialising it as a "private queue" bound context (see HERE for more details).
Basically, this means that in order to use this context properly you have to execute code on it wrapped in the context's performBlock: or performBlockAndWait: methods.
You can use these methods to queue "tasks" for the context to execute (a serial execution queue).
This code will be executed in the background.
You could also create a context that is confined to a specific thread (NSConfinementConcurrencyType) and then only access it in the thread it was created in without the use of the "perform block" methods.
Merging the changes from this context into your main context is done either by setting the main context as this context's parent after initialisation, or by registering for its "did save" notification and merging the changes into the main context (the first method is easier to implement).
One solution to your flow would be:
Create a main queue context (to be passed around to all view controllers)
Create an operation queue
observe a status entity (fetched in the main context) in your view (directly or by using an FRC)
each operation would create a "private queue" context with the main context as parent
when updates are ready to be committed (including status updates), save the private context and the main context using performBlockAndWait: (just to keep the operation coherent)
This will still affect your main thread if there are many updates as all saves will go through the main context.
Try to segment your saves to avoid locking the main context for too long.
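A rough sketch of one such operation under the flow above; operationQueue, mainContext, recordObjectID, and the isUploaded key are assumptions:
[operationQueue addOperationWithBlock:^{
    NSManagedObjectContext *workerContext =
        [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    workerContext.parentContext = mainContext;

    // ... parse the record and upload the JSON via FTP outside of Core Data ...

    [workerContext performBlockAndWait:^{
        NSManagedObject *record = [workerContext objectWithID:recordObjectID];
        [record setValue:@YES forKey:@"isUploaded"];

        NSError *error = nil;
        [workerContext save:&error];     // pushes the change up to the main context
    }];

    [mainContext performBlockAndWait:^{
        NSError *error = nil;
        [mainContext save:&error];       // persists to the store; the observed status/FRC updates
    }];
}];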

Pitfalls of using two persistent store coordinators for efficient background updates

I am searching for the best possible way to update a fairly large core-data based dataset in the background, with as little effect on the application UI (main thread) as possible.
There's some good material available on this topic including:
Session 211 from WWDC 2013 (Core Data Performance Optimization and Debugging, from around 25:30 onwards)
Importing Large Data Sets from objc.io
Common Background Practices from objc.io (Core Data in the Background)
Backstage with Nested Managed Object Contexts
Based on my research and personal experience, the best option available is to effectively use two separate Core Data stacks that only share data at the database (SQLite) level. This means that we need two separate NSPersistentStoreCoordinators, each of them having its own NSManagedObjectContext. With write-ahead logging enabled on the database (the default from iOS 7 onwards), the need for locking could be avoided in almost all cases (except when we have two or more simultaneous writes, which is not likely in my scenario).
In order to do efficient background updates and conserve memory, one also needs to process data in batches and periodically save the background context, so the dirty objects get stored to the database and flushed from memory. One can use the NSManagedObjectContextDidSaveNotification that gets generated at this point to merge the background changes into the main context, but in general you don't want to update your UI immediately after a batch has been saved. You want to wait until the background job is completely done and then refresh the UI (recommended in both the WWDC session and the objc.io articles). This effectively means that the application's main context remains out of sync with the database for a certain time period.
All this leads me to my main question, which is: what can go wrong if I change the database in this manner without immediately telling the main context to merge changes? I'm assuming it's not all sunshine and roses.
One specific scenario that I have in my head is: what happens if a fault needs to be fulfilled for an object loaded in the main context, but the background operation has in the meantime deleted that object from the database? Can this, for instance, happen with an NSFetchedResultsController-based table view that uses a batchSize to fetch objects incrementally into memory? I.e., an object that has not yet been fully fetched gets deleted, but then we scroll up to a point where the object needs to get loaded. Is this a potential problem? Can other things go wrong? I'd appreciate any input on this matter.
Great question!
I.e., an object that has not yet been fully fetched gets deleted, but then we scroll up to a point where the object needs to get loaded. Is this a potential problem?
Unfortunately it will cause problems. The following exception will be thrown:
Terminating app due to uncaught exception 'NSObjectInaccessibleException', reason: 'CoreData could not fulfill a fault for '0xc544570 <x-coredata://(...)>'
This blog post (section titled "How to do concurrency with Core Data?") might be somewhat helpful, but it doesn't exhaust this topic. I'm struggling with the same problems in an app I'm working on right now and would love to read a write-up about it.
Based on your question, comments, and my own experience, it seems the larger problem you are trying to solve is:
1. Using an NSFetchedResultsController on the main thread with thread confinement
2. Importing a large data set, which will insert, update, or delete managed objects in a context.
3. The import causes large merge notifications to be processed by the main thread to update the UI.
4. The large merge has several possible effects:
- The UI gets slow, or too busy to be usable. This may be because you are using beginUpdates/endUpdates to update a table view in your NSFetchedResultsControllerDelegate, and you have a LOT of animations queuing up because of the large merge.
- Users can run into "Could not fulfill fault" as they try to access a faulted object which has been removed from the store. The managed object context thinks it exists, but when it goes to the store to fulfill the fault, it has already been deleted. If you are using reloadData to update a table view in your NSFetchedResultsControllerDelegate, you are more likely to see this happen than when using beginUpdates/endUpdates.
The approach you are trying to use to solve the above issues is:
- Create two NSPersistentStoreCoordinators, each attached to the same NSPersistentStore or at least the same NSPersistentStore SQLite store file URL.
- Your import occurs on NSManagedObjectContext 1, attached to NSPersistentStoreCoordinator 1, and executing on some other thread(s). Your NSFetchedResultsController is using NSManagedObjectContext 2, attached to NSPersistentStoreCoordinator 2, running on the main thread.
- You are moving the changes from NSManagedObjectContext 1 to 2
You will run into a few problems with this approach.
- An NSPersistentStoreCoordinator's job is to mediate between its attached NSManagedObjectContexts and its attached stores. In the multiple-coordinator-context scenario you are describing, changes made by NSManagedObjectContext 1 that alter the SQLite file will not be seen by NSPersistentStoreCoordinator 2 and its context. 2 does not know that 1 changed the file, and you will get "Could not fulfill fault" and other exciting exceptions.
- You will still, at some point, have to put the changed NSManagedObjects from the import into NSManagedObjectContext 2. If these changes are large, you will still have UI problems and the UI will be out of sync with the store, potentially leading to "Could not fulfill fault".
- In general, because NSManagedObjectContext 2 is not using the same NSPersistentStoreCoordinator as NSManagedObjectContext 1, you are going to have problems with things being out of sync. This isn't how these things are intended to be used together. If you import and save in NSManagedObjectContext 1, NSManagedObjectContext 2 is immediately in a state not consistent with the store.
Those are SOME of the things that could go wrong with this approach. Most of these problems will become visible when firing a fault, because that accesses the store. You can read more about how this process works in the Core Data Programming Guide, while the Incremental Store Programming Guide describes the process in more detail. The SQLite store follows the same process that an incremental store implementation does.
Again, the use case you are describing - getting a ton of new data, executing find-or-create on the data to create or update managed objects, and deleting "stale" objects that may in fact be the majority of the store - is something I have dealt with every day for several years, seeing all of the same problems you are. There are solutions - even for imports that change 60,000 complex objects at a time, and even using thread confinement! - but that is outside the scope of your question.
(Hint: Parent-Child contexts don't need merge notifications).
Two persistent store coordinators (PSCs) are certainly the way to go with large datasets. File locking is faster than the locking within Core Data.
There's no reason you couldn't use the background PSC to create thread-confined NSManagedObjectContexts, one for each operation you do in the background. However, instead of letting Core Data manage the queueing, you now need to create NSOperationQueues and/or threads to manage the operations based on what you're doing in the background. NSManagedObjectContexts are cheap and not expensive to create. Once you do this, you can hang onto a context for just that one operation's and/or thread's lifetime, build up as many changes as you want, and wait until the end to commit them and merge them to the main thread however you decide. Even if you have some main-thread writes, you can still, at crucial points in the operation's lifetime, refetch/merge back into your thread's context.
Also, it's important to know that if you're working on large sets of data, you don't need to worry about merging contexts as long as you aren't touching something else. For example, if you have class A and class B, with two separate operations/threads working on them, and they have no direct relationship, you do not have to merge the contexts when one changes; you can keep rolling with the changes. The only major need for merging background contexts in this fashion is if there are direct relationships faulting. It would be better to prevent this, though, through some sort of serialization, whether with NSOperationQueue or whatever else. So feel free to work away on different objects in the background; just be careful about their relationships.
I've worked on large-scale Core Data projects and have had this pattern work very well for me.
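For orientation, a minimal sketch of two independent stacks pointed at the same SQLite file; model and storeURL are assumptions, error handling is omitted, and queue-based contexts are used here rather than the thread-confined ones mentioned above:
NSPersistentStoreCoordinator *uiPSC =
    [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];
[uiPSC addPersistentStoreWithType:NSSQLiteStoreType configuration:nil
                              URL:storeURL options:nil error:NULL];

NSPersistentStoreCoordinator *importPSC =
    [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];
[importPSC addPersistentStoreWithType:NSSQLiteStoreType configuration:nil
                                  URL:storeURL options:nil error:NULL];

NSManagedObjectContext *uiContext =
    [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
uiContext.persistentStoreCoordinator = uiPSC;

NSManagedObjectContext *importContext =
    [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
importContext.persistentStoreCoordinator = importPSC;

// Changes saved through importPSC only become visible to uiContext once it
// refetches (or merges/resets), as discussed in the answers above.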
Indeed, this is the best Core Data scenario you can work with: almost no main UI staleness, and easy background management of your data. When you want to tell the main context (and maybe a currently running NSFetchedResultsController) about the changes, you listen for save notifications from the backgroundObjectContext like this:
[[NSNotificationCenter defaultCenter]
    addObserver:self
       selector:@selector(reloadFetchedResults:)
           name:NSManagedObjectContextDidSaveNotification
         object:backgroundObjectContext];
Then you can merge the changes, but you must wait for the main-thread context to pick them up before relying on them. When you receive the NSManagedObjectContextDidSaveNotification, the main context has not yet incorporated the changes. Hence the performBlockAndWait: is mandatory, so the main context gets the changes and the NSFetchedResultsController then updates its values correctly.
- (void)reloadFetchedResults:(NSNotification *)notification
{
    NSManagedObjectContext *moc = [notification object];
    if ([moc isEqual:backgroundObjectContext])
    {
        // Delete the fetched results controller's caches if the save contained deletions
        if ([[notification.userInfo objectForKey:NSDeletedObjectsKey] count]) {
            [NSFetchedResultsController deleteCacheWithName:nil];
        }
        // Block the background execution of the save, and merge the changes first
        [managedObjectContext performBlockAndWait:^{
            [managedObjectContext mergeChangesFromContextDidSaveNotification:notification];
        }];
    }
}
There is a pitfall no one has noticed: you can get the save notification before the background context has actually saved the object you want to merge. If you want to avoid problems caused by a faster main context asking for an object that has not yet been saved by the background context, you should (you really should) call obtainPermanentIDsForObjects: before any background save. Then you are safe to call mergeChangesFromContextDidSaveNotification:. This ensures that the merge receives a valid permanent ID.
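A minimal sketch of that save routine; backgroundContext is an assumption:
[backgroundContext performBlockAndWait:^{
    // Give freshly inserted objects permanent IDs before the did-save
    // notification goes out, so the merge never sees temporary IDs.
    NSArray *inserted = [[backgroundContext insertedObjects] allObjects];
    if (inserted.count > 0) {
        NSError *idError = nil;
        [backgroundContext obtainPermanentIDsForObjects:inserted error:&idError];
    }

    NSError *saveError = nil;
    [backgroundContext save:&saveError];
}];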

Core Data stack with only a single context initialized with NSPrivateQueueConcurrencyType

I'm working on an app that requires multiple asynchronous downloads and saving of their contents to Core Data entities. One of the downloads is large, and I noticed the UI was being blocked while creating/writing to the managed object context. My research led me to read up on concurrent Core Data setups, and I started implementing one of these. But I'm running into issues and spending a lot of time correcting things.
Before I continue, I'm thinking about simply setting up a single MOC with NSPrivateQueueConcurrencyType. Nothing I read mentions doing this. This way I could optionally perform MOC operations in the background, or just use the main thread as usual while maintaining a single MOC.
Is this a good approach? If not, what is wrong with it? I doubt this is the right approach because if it is, NSPrivateQueueConcurrencyType dominates NSMainQueueConcurrencyType and there would be no reason to have the latter.
There is nothing wrong with using a NSPrivateQueueConcurrencyType MOC for background tasks.
But you will probably still need a NSMainQueueConcurrencyType MOC.
From the documentation:
The context is associated with the main queue, and as such is tied into the application’s event loop, but it is otherwise similar to a private queue-based context. You use this queue type for contexts linked to controllers and UI objects that are required to be used only on the main thread.
As an example, for a fetched results controller you would use the NSMainQueueConcurrencyType MOC.
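Roughly, the two context types sit side by side like this; coordinator and fetchRequest are assumptions, and parenting the private-queue context to the main-queue one is just one possible arrangement:
NSManagedObjectContext *mainMOC =
    [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
mainMOC.persistentStoreCoordinator = coordinator;

NSManagedObjectContext *backgroundMOC =
    [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
backgroundMOC.parentContext = mainMOC;   // background saves surface in the UI context

// The fetched results controller stays on the main-queue context:
NSFetchedResultsController *frc =
    [[NSFetchedResultsController alloc] initWithFetchRequest:fetchRequest
                                        managedObjectContext:mainMOC
                                          sectionNameKeyPath:nil
                                                   cacheName:nil];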

When do Core Data merge conflicts occur, exactly?

Say I have an app with the following entities: Page <->> BookPosition <<-> Book. My app's job is to empower users to mix and match pages in one or more picture books using a GUI (on the main thread).
Now, two points:
When a user modifies a book, BookPositions are created and destroyed, but never changed.
A page can be shared across multiple books.
Let's say I synchronize a bevy of picture books (some already exist, some are new) on a background thread using a shared PSC, a separate MOC, and watertight thread isolation. Before I start the process, I make sure users can't modify any one to-be-synced book until after the MOC on the background thread has invoked -[NSManagedObjectContext save:] for that book, and correspondingly generated an NSManagedObjectContextDidSaveNotification, which is merged in on the main thread using mergeChangesFromContextDidSaveNotification:.
Me being naive, I didn't expect this approach would ever cause a merge conflict since - ostensibly - none of the entities in question ever have the potential to change on one thread before a call to mergeChangesFromContextDidSaveNotification: can bring both MOCs up to date. However, merge conflicts will occur on the background thread if users make a change on the main thread that mutates the set of BookPositions associated with a Page shared between an already-synced book and a to-be-synced book.
The takeaway here is that changing (adding or removing) entities on the "many" side of a one-to-many relationship results in a new version of the object on the "one" side, even when nothing else on the "one" side has changed. Everything makes sense so far, and I figured I could resolve the problem by setting a merge policy (either NSMergeByPropertyStoreTrumpMergePolicy or NSMergeByPropertyObjectTrumpMergePolicy should work here?) on the background MOC. Unfortunately (and unexpectedly), after setting a merge policy, I wound up with conflicts on the main thread, which is precisely why I'm confused.
I'd like to know at what point the data in my persistent store is going out of sync with my MOC on the main thread. Does the discrepancy occur as the background MOC is saving / before it has a chance to post a NSManagedObjectContextDidSaveNotification?
I seem to be able to resolve the problem by also setting a merge policy on the main MOC, but if possible, I'd like to know exactly what's going on here.
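For reference, the merge-policy setup mentioned above is just a property assignment per context; the context names and the choice of which side should win are assumptions that depend on whose BookPosition data you want to keep:
// In-memory (unsaved) changes win over the store version on conflict.
backgroundMOC.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy;
// For the main context, let the store version win instead (one possible choice).
mainMOC.mergePolicy = NSMergeByPropertyStoreTrumpMergePolicy;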

Resources