Core data multithreading performance - ios

I am developing an application that uses Core Data for internal storage. The application has the following features:
Synchronize data with a server by downloading and parsing a large XML file, then saving the entries with Core Data.
Allow the user to run large fetches and perform CRUD operations.
I have read a lot of documentation, and there seem to be several patterns for multithreading with Core Data:
Nested contexts: this pattern seems to have many performance issues (child contexts block their ancestors when fetching).
Use one main-thread context and background worker contexts.
Use a single context (the main-thread context) and apply multithreading with GCD.
I tried all three approaches and found that the last two work fine. However, I am not sure whether they are sound in terms of performance.
Is there a well-known, performant pattern for building a robust application with the features above?

rokridi,
In my Twitter iOS apps, Retweever and #chat, I use a simple two-MOC model. All database insertions and deletions take place on a private concurrent insertionMOC. The main MOC merges changes through the -save: notifications from the insertionMOC and, during merge processing, emits a custom UI update notification. This lets me work in a staged fashion: all tweets that come into the app are processed in the background and are presented to the UI only when everything is done.
If you download the apps, #chat's engine has been modernized and is more efficient and more isolated from the main thread than Retweever's engine.
Anon,
Andrew

Apple recommends using a separate context for each thread.
The pattern recommended for concurrent programming with Core Data is
thread confinement: each thread must have its own entirely private
managed object context. There are two possible ways to adopt the
pattern: Create a separate managed object context for each thread
and share a single persistent store coordinator. This is the
typically-recommended approach. Create a separate managed object
context and persistent store coordinator for each thread. This
approach provides for greater concurrency at the expense of greater
complexity (particularly if you need to communicate changes between
different contexts) and increased memory usage.
See the Apple documentation.

As per the Apple documentation, use thread confinement to support concurrency.
Create one managed object context per thread; it will make your life easier. This applies when you are parsing large data in the background while also fetching data on the main thread to display in the UI.
As for the merging issue, there are a couple of good ways to handle it.
Never pass managed objects between threads. Instead, pass object IDs to the other thread and look the objects up there. For example, when you save data parsed from XML, save it on the current thread's MOC, collect the object IDs, and pass them to the UI thread, where you re-fetch the objects.
You can also register for the context's save notification. When one MOC saves, you are notified with a user info dictionary that contains the changed objects, and you can pass the notification to the other context's merge method (mergeChangesFromContextDidSaveNotification:).
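A minimal sketch of the notification-based merge, in case it helps. The helper name ObserveBackgroundSaves is made up for illustration, and it assumes that both contexts share one persistent store coordinator and that mainContext was created with NSMainQueueConcurrencyType (the queue-based API from iOS 5, rather than strict thread confinement).

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: keeps `mainContext` in sync with saves made on
// `backgroundContext`. Assumes both contexts share one persistent store
// coordinator and that `mainContext` uses NSMainQueueConcurrencyType.
// Returns the observer token; pass it to -removeObserver: when finished.
static id ObserveBackgroundSaves(NSManagedObjectContext *mainContext,
                                 NSManagedObjectContext *backgroundContext) {
    return [[NSNotificationCenter defaultCenter]
        addObserverForName:NSManagedObjectContextDidSaveNotification
                    object:backgroundContext
                     queue:nil
                usingBlock:^(NSNotification *note) {
        // The notification arrives on the thread that performed the save,
        // so hop onto the main context's own queue before touching it.
        [mainContext performBlock:^{
            [mainContext mergeChangesFromContextDidSaveNotification:note];
        }];
    }];
}
```

With strict thread confinement you would forward the notification to the main thread yourself (for example with performSelectorOnMainThread:withObject:waitUntilDone:) instead of calling performBlock:.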

Related

Core Data Concurrency

I would like some suggestions for making Core Data operations concurrent in my project. The project has been running for two years, so it has many implementations that could be optimized using newer Objective-C features. Mainly, I am looking to optimize Core Data operations.
Currently most data operations are done on the main managed object context. Recently I implemented a new feature that downloads a large data set after login and inserts it into the database using Core Data. This was supposed to execute in parallel with other operations in the application. I have now realized that the Core Data code is executing on the main thread, because the application's UI blocks during the Core Data operation. I read many blogs and learned that there are two strategies for achieving Core Data concurrency: multiple contexts synchronized through notifications, and parent/child managed object contexts.
I tried the parent/child strategy, since Apple does not favor the other one. But I am getting random crashes with the exception “Collection was mutated while being enumerated” in executeFetchRequest. The exception started after I implemented the parent/child strategy. Can anyone help me solve this issue?
Yeah, I know there are not many blogs that describe efficient use of Core Data in a project, but luckily I found one that addresses your problem directly. Check here: https://medium.com/soundwave-stories/core-data-cffe22efe716#.3wcpw1ijo
Your exception occurs because the database is being updated while it is in use somewhere else. To get rid of the exception, you can do the following:
If you are fetching data into an array or dictionary, change the statement like this:
NSMutableDictionary *myDict = [coreDataDectionary mutableCopy];
Now you can perform any operation on the array or dictionary you fetched from the database without triggering the exception.
Hope this helps you.
Try this:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    // DATA PROCESSING (runs on a background queue)
    dispatch_async(dispatch_get_main_queue(), ^{
        // UPDATE UI (runs on the main queue)
    });
});
You should use a completion block in your code. Here is a tutorial and explanation.
It keeps your application's UI from freezing even while the download is still in progress.
Execution continues even if the code inside the block has not finished yet. The block is called back when the download is over.
Use this Core Data stack to minimize UI locks when importing large datasets:
One main thread MOC with its own PSC.
One background MOC with its own PSC.
Merge changes into main thread MOC on background MOC's save notifications.
Yes, you can – and should – use two independent PSCs (NSPersistentStoreCoordinator) pointing to the same .sqlite file. It will reduce overall locking time to just SQLite locks, avoiding PSC-level locks, so overall UI locking time will be [SQLite write lock] + [main thread MOC read].
You can use background MOC with NSConfinementConcurrencyType within a background thread or even better within an NSOperation – I found it very convenient to process data and feed it to Core Data on the same thread.
Import in batches. Choose batch size empirically. Reset the background MOC after each save.
When processing really large datasets, with hundreds of thousands of objects, do not use refreshObject:mergeChanges: with main thread MOC on every save. It is slow and eventually will consume all of the available memory. Reload your FRCs instead.
As for "Collection was mutated while being enumerated": to-many relationships in Core Data are mutable sets, so you have to make a copy or, better, sort them into NSArrays before iterating.
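A rough sketch of the "import in batches, then reset" step. ImportTitles, the "Entry" entity and its "title" attribute are placeholder names, not part of the setup above. For brevity it uses a private-queue context where the answer suggests a confinement context inside an NSOperation; the coordinator you pass in can be the background PSC.

```objc
#import <CoreData/CoreData.h>

// Hypothetical importer. "Entry" and "title" are placeholder names.
// Call it from a background queue or NSOperation: it blocks the calling
// thread until the whole import has been saved.
static void ImportTitles(NSArray<NSString *> *titles,
                         NSPersistentStoreCoordinator *coordinator,
                         NSUInteger batchSize) {
    NSManagedObjectContext *context = [[NSManagedObjectContext alloc]
        initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    context.persistentStoreCoordinator = coordinator;
    context.undoManager = nil; // no undo bookkeeping during a bulk import

    [context performBlockAndWait:^{
        NSUInteger pending = 0;
        for (NSString *title in titles) {
            NSManagedObject *entry =
                [NSEntityDescription insertNewObjectForEntityForName:@"Entry"
                                              inManagedObjectContext:context];
            [entry setValue:title forKey:@"title"];

            if (++pending == batchSize) {
                NSError *error = nil;
                if (![context save:&error]) NSLog(@"Batch save failed: %@", error);
                [context reset]; // drop the saved objects to keep memory flat
                pending = 0;
            }
        }
        // Save whatever is left over from the last, partial batch.
        NSError *error = nil;
        if (context.hasChanges && ![context save:&error]) {
            NSLog(@"Final save failed: %@", error);
        }
    }];
}
```

The batch size is a parameter because, as noted above, the right value has to be found empirically.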

Magical Record with concurrency

I'm in the middle of developing an iOS application. After working for quite some time with Core Data and Magical Record, I am getting this error:
error: NULL _cd_rawData but the object is not being turned into a fault
I didn't know Core Data before this project and as it turns out I was very naive to think I can just work with Magical Record without worrying about concurrency, as I haven't dedicated any thoughts/work about managed contexts for the main thread and background threads.
After A LOT of reading about Core Data Managed Object Contexts and Magical Record, I understand that:
NSManagedObjects are not thread safe.
NSManagedObjectID IS thread safe.
I can use: Entity *localEntity = [entity MR_inContext:localContext] of Magical Record to work with an entity in a background thread's context.
I should use Magical Record's saveWithBlock:completion: and saveWithBlockAndWait: methods to get a managed context to use for background threads.
A little information regarding my application:
I'm using the latest version of Magical Record which is 2.2.
I have a backend server which my application talks to a lot.
Their communication is similar to Whatsapp, as it uses background threads for communicating with the server and updating managed objects upon successful responses.
I'm wrapping the model with DataModel objects that hold the managed objects in arrays for quick referencing of UI/background use.
Now - my questions are:
Should I fetch only from the UI thread? Is it okay that I'm holding the managed objects in DataModel objects?
How can I create a new entity from a background thread and use the newly created entity in the DataModel objects?
Is there a best design scenario that I should use? specifically when sending a request to the server and getting a response, should I create a new managed context and use it throughout the thread's activity?
Let me know if everything is clear. If not, I'll try and add clarity.
Any help or guidelines would be appreciated.
Thanks!
I'm not working with MagicalRecord, but these questions are more related to CoreData than to MagicalRecord, so I'll try to answer them :).
1) Fetching from main(UI) thread
There are many ways to design an app's model, so here are two important things I've learned from using Core Data for a few years:
When dealing with the UI, always fetch objects on the main thread. As you correctly stated, NSManagedObjects are not thread safe, so you can't (well, you can, but shouldn't) access their data from a different thread. NSFetchedResultsController is your best friend when you need to display long lists (e.g. for messages, but watch out for the request's batchSize).
You should design your storage and fetches to be fast and responsive. Use indexes, fetch only the properties you need, prefetch relationships, etc.
On the other hand, if you need to fetch from a large amount of data, you can use a context on a different thread and transfer only NSManagedObjectIDs. Let's say your user has a huge number of messages and you want to show the latest 10 from a specific contact. You can create a background context (private concurrency), fetch those 10 message IDs (NSManagedObjectIDResultType), store them in an array (or any other format that suits you), return them to your main thread, and fetch only those IDs there. Note that this approach speeds things up only if the fetch is slow because of the predicate or sort descriptor, not if the "problem" is the process of turning faults into objects (e.g. a large UIImage stored in a transformable attribute :) )
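Here is a sketch of that last idea, under some assumptions: the entity is called "Message" and has a "date" attribute (both placeholder names), mainContext is a main-queue context, and FetchLatestMessages is a made-up helper. The contact predicate is left out to keep it short.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper. "Message" and "date" are placeholder names.
// The expensive sorted fetch runs on a private-queue context and returns
// only object IDs; the main context then fetches just those ten objects.
static void FetchLatestMessages(NSPersistentStoreCoordinator *coordinator,
                                NSManagedObjectContext *mainContext,
                                void (^completion)(NSArray<NSManagedObject *> *messages)) {
    NSManagedObjectContext *background = [[NSManagedObjectContext alloc]
        initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    background.persistentStoreCoordinator = coordinator;
    NSArray *newestFirst =
        @[[NSSortDescriptor sortDescriptorWithKey:@"date" ascending:NO]];

    [background performBlock:^{
        NSFetchRequest *idRequest = [NSFetchRequest fetchRequestWithEntityName:@"Message"];
        idRequest.resultType = NSManagedObjectIDResultType; // IDs only, no objects
        idRequest.sortDescriptors = newestFirst;
        idRequest.fetchLimit = 10;
        NSArray *ids = [background executeFetchRequest:idRequest error:NULL] ?: @[];

        // Object IDs are safe to hand across threads; managed objects are not.
        [mainContext performBlock:^{
            NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Message"];
            request.predicate = [NSPredicate predicateWithFormat:@"self IN %@", ids];
            request.sortDescriptors = newestFirst;
            completion([mainContext executeFetchRequest:request error:NULL] ?: @[]);
        }];
    }];
}
```

The completion block runs on the main queue, so it can update the UI directly.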
2) Creating entity in background
You can create the object in a background context, store its NSManagedObjectID after you save the context (the object has only a temporary ID before the save), and send it back to your main thread, where you can fetch by ID and get the object in your main context.
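Roughly, that could look like the sketch below. CreateEntryInBackground, "Entry" and "title" are placeholder names, and mainContext is assumed to be a main-queue context on the same persistent store coordinator.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper. "Entry" and "title" are placeholder names.
// Creates an object on a private-queue context, saves it, and hands the
// main-context version of it to `completion` on the main queue
// (or nil if the save failed).
static void CreateEntryInBackground(NSString *title,
                                    NSPersistentStoreCoordinator *coordinator,
                                    NSManagedObjectContext *mainContext,
                                    void (^completion)(NSManagedObject *entry)) {
    NSManagedObjectContext *background = [[NSManagedObjectContext alloc]
        initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    background.persistentStoreCoordinator = coordinator;

    [background performBlock:^{
        NSManagedObject *created =
            [NSEntityDescription insertNewObjectForEntityForName:@"Entry"
                                          inManagedObjectContext:background];
        [created setValue:title forKey:@"title"];

        // Before the save, created.objectID is only a temporary ID.
        NSError *error = nil;
        BOOL saved = [background save:&error];
        if (!saved) NSLog(@"Save failed: %@", error);
        NSManagedObjectID *objectID = saved ? created.objectID : nil;

        [mainContext performBlock:^{
            NSManagedObject *entry = objectID
                ? [mainContext existingObjectWithID:objectID error:NULL]
                : nil;
            completion(entry);
        }];
    }];
}
```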
3) Working with background context
I don't know if it's exactly the best approach, but I'm pretty much satisfied with observing NSManagedObjectContext notifications and merging from them. Check out:
mergeChangesFromContextDidSaveNotification:
So, you create a background context and register an observer for its save notification (NSManagedObjectContextDidSaveNotification) on behalf of the main context. The background context automatically sends a notification every time you save, describing all of its changes: inserted, updated, and deleted objects (no worries, you can just merge them by calling mergeChangesFromContextDidSaveNotification:). This has many advantages, such as:
everything is updated automatically (every object you have fetched in "observing context" gets updated/deleted)
every merge runs in memory (no fetches, no persisting on main thread)
if you implement NSFetchedResultsController's delegate method, everything updates automatically (not exactly everything – see below)
On the other hand:
take care with the merge policy (NSManagedObjectContext mergePolicy)
watch out for references to managed objects that were deleted in the background (or in any other context)
NSFetchedResultsController updates only on changes to "direct" attributes (check out this SO question)
Well, I hope this answers your questions. If anything comes up, don't hesitate to ask :)
Side note about child contexts
Also take a peek at child contexts. They can be powerful too. Basically, every child context sends its changes to its parent context on save (a "base" context, one with no parent, sends its changes to the persistent store coordinator).
For example, when you are building an edit/add controller, you can create a child context from your main context and perform all changes in it. When the user decides to cancel the operation, you just destroy the child context (remove the reference) and no changes are stored. If the user decides to accept the changes, save the child context and destroy it. Saving the child context propagates all changes to its parent store (in this example, your main context). Just be sure to also save the parent context at some point to persist these changes (the save: method doesn't do the bubbling). Check out the documentation on managing the parent store.
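A small sketch of that edit/cancel flow. MakeEditingContext and CommitEdits are made-up helper names, and the main context is assumed to be queue-based (main-queue or private-queue), which a parent context has to be.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: a scratch context for an edit/add screen.
// `mainContext` must be queue-based to act as a parent.
static NSManagedObjectContext *MakeEditingContext(NSManagedObjectContext *mainContext) {
    NSManagedObjectContext *child = [[NSManagedObjectContext alloc]
        initWithConcurrencyType:NSMainQueueConcurrencyType];
    child.parentContext = mainContext;
    return child;
}

// Hypothetical helper: pushes the child's changes into its parent, then
// saves the parent so the changes actually reach the persistent store.
static BOOL CommitEdits(NSManagedObjectContext *child, NSError **error) {
    __block BOOL ok = NO;
    __block NSError *localError = nil;
    [child performBlockAndWait:^{
        ok = [child save:&localError]; // only moves changes up one level
    }];
    if (ok) {
        NSManagedObjectContext *parent = child.parentContext;
        [parent performBlockAndWait:^{
            ok = [parent save:&localError];
        }];
    }
    if (error) *error = localError;
    return ok;
}
```

To cancel, just drop the reference to the child context without saving it; nothing reaches the parent.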
Happy coding!

How to dynamically use MOC depending on thread to protect core data

I've read through the material on Core Data and threading and understand the principle of a separate MOC for each thread. My question is: what is the best way to dynamically decide whether to use a different MOC or the main one? I have some methods that are sometimes called on the main thread and sometimes in the background. Is dynamically detecting the thread discouraged, or is it okay? Are there any pitfalls? Or do people just write separate methods for the background processes?
Some additional detail: I have a refresh process that performs a bunch of updates off the main thread (so as not to lock the UI while the user is waiting) using a simple performSelectorInBackground. This process moves through its steps serially, so I don't have to worry about multiple things accessing the DB on THIS thread; obviously the trick is keeping the main and background threads safe. I have implemented this with a separate context and merging in other places, but I recently rearchitected and am now calling methods in the background that I wasn't before. So I want to rewrite those to use the separate context, but sometimes I'll be hitting them on the main thread, where I can access the main MOC just fine.
You do not give much detail about how you are managing your background operation and what you are doing with it, so it is pretty difficult to suggest anything.
In general, since creating a MOC is a pretty fast operation, you could create a new temporary MOC each time you need one in read-only mode (e.g. for data lookup). If you also have updates (e.g. adding new objects or modifying existing ones), you should factor in the cost of merging, so creating a temporary MOC each time might not be a good approach.
Another good approach could be creating a child context in your background thread.
But, as I said, it all depends on what you are doing.
Have a look at this good post about multi-threaded Core Data usage: Multi-Context CoreData. It describes a couple of scenarios and the solutions for them.
EDIT:
You could certainly use isMainThread to discriminate between the two cases (where you can use the main MOC and when you need a new one). That is what that method is for (and it is surely not expensive).
On the other hand, if you want a cleaner implementation, the best approach IMO would be to create a child MOC, which greatly simplifies the merging process: it becomes almost automatic, since you just need to save the parent context after saving the temporary context.
You'll need a new NSManagedObjectContext for each thread, and you'll need to create new versions of your NSManagedObjects from that thread's new MOC. Read @sergio's answer regarding the pros and cons of that approach.
To check if you're on the main thread, you can use [NSThread isMainThread] and make determinations that way. Or, when you're spinning up a new thread to crunch on CoreData, also create a new MOC.
A common approach is to associate each managed object context with a particular serial dispatch queue. So there's one for the main queue, and you can dynamically create them otherwise.
Once you're tying these things to queues, you can use dispatch_queue_set_specific to attach a particular context to a particular queue and dispatch_get_specific to get the context for the current queue. They both turned up in iOS 5 so you'll see some iOS 4-compatible code that jumps through much more complicated hoops but you don't really need to worry about it any more.
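For what it's worth, a minimal sketch of the queue-specific approach. AttachContextToQueue and CurrentQueueContext are made-up helper names, and how each context is created is left to the caller.

```objc
#import <CoreData/CoreData.h>

// The address of this variable serves as the unique key for the
// queue-specific slot; its value is never used.
static char kContextKey;

// Destructor called by GCD when the queue is destroyed or the value is
// replaced. Balances the __bridge_retained cast below.
static void ReleaseContext(void *context) {
    CFRelease(context);
}

// Hypothetical helper: ties `context` to `queue`. Pass
// dispatch_get_main_queue() to register the main MOC.
static void AttachContextToQueue(NSManagedObjectContext *context,
                                 dispatch_queue_t queue) {
    dispatch_queue_set_specific(queue, &kContextKey,
                                (__bridge_retained void *)context,
                                ReleaseContext);
}

// Returns the context attached to the queue the caller is currently
// running on, or nil if that queue has none.
static NSManagedObjectContext *CurrentQueueContext(void) {
    return (__bridge NSManagedObjectContext *)dispatch_get_specific(&kContextKey);
}
```

A method that may run on either queue can then call CurrentQueueContext() instead of checking isMainThread and picking a context by hand.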
Alternatively, if your contexts are tied to particular NSRunLoops or NSThreads, store the context in [[NSThread currentThread] threadDictionary]; that is exactly what it's there for.

Core Data stack with only a single context initialized with NSPrivateQueueConcurrencyType

I'm working on an app that requires multiple asynchronous downloads and saves their contents to Core Data entities. One of the downloads is large, and I noticed that the UI was being blocked while creating objects in and writing to the managed object context. My research led me to read up on concurrent Core Data setups, and I started implementing one of them. But I'm running into issues and spending a lot of time correcting things.
Before I continue, I'm thinking about simply setting up a single MOC with NSPrivateQueueConcurrencyType. Nothing I read mentions doing this. This way I could optionally perform MOC operations in the background, or just use the main thread as usual while maintaining a single MOC.
Is this a good approach? If not, what is wrong with it? I doubt this is the right approach because if it is, NSPrivateQueueConcurrencyType dominates NSMainQueueConcurrencyType and there would be no reason to have the latter.
There is nothing wrong with using an NSPrivateQueueConcurrencyType MOC for background tasks.
But you will probably still need an NSMainQueueConcurrencyType MOC.
From the documentation:
The context is associated with the main queue, and as such is tied
into the application’s event loop, but it is otherwise similar to a
private queue-based context. You use this queue type for contexts
linked to controllers and UI objects that are required to be used only
on the main thread.
As an example, for a fetched results controller, you would use the
NSMainQueueConcurrencyType MOC.

Different ManagedObjectContexts for foreground and background threads

I am new to iOS dev. I am writing an iOS app that allows the user to read/write core data records. These records are to be synced to a server across http. I have a set of chained (serially) NSOperations running in a background thread that perform the sync.
The user can read/write at the same time as the sync is running. My plan is to use two managedObjectContexts within the app (both using the same persistentStoreCoordinator), one for foreground and one for background.
All background threads created by my NSOperations will run serially and will use the background MOC. All UI-based stuff will use the foreground MOC.
My question is this: is this an acceptable iOS core data pattern? Can I happily have reads/writes occurring against the same model database within these two MOCs without worrying about locking and concurrency issues?
Thanks very much.
This is a common core-data pattern and one the framework was designed to accommodate.
If you are managing the threads yourself, you need to use a technique called "Thread Confinement". You can read more about it in the documentation, in the section titled "Concurrency with Core Data".
In addition to thread confinement, there are also new features in iOS 5.0 designed to help manage concurrency. An NSManagedObjectContext can now be configured with an NSManagedObjectContextConcurrencyType. You can choose between NSMainQueueConcurrencyType and NSPrivateQueueConcurrencyType.
A context with NSMainQueueConcurrencyType runs on the main thread and can be used to serve the UI. A context with NSPrivateQueueConcurrencyType is used for background tasks.
To use the private context you interact with it through the performBlock: and performBlockAndWait: methods to ensure your code is executed on the correct thread. To use the main queue context you can interact with it as normal, or if your code is not running on the main thread, by using the block methods.
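A short sketch of both block methods. InsertRecord, CountRecords, the "Record" entity and its "name" attribute are placeholder names; the context is assumed to be queue-based.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper. "Record" and "name" are placeholder names.
// performBlock: returns immediately; the block runs later on the
// context's own queue, so the caller is never blocked.
static void InsertRecord(NSManagedObjectContext *privateContext, NSString *name) {
    [privateContext performBlock:^{
        NSManagedObject *record =
            [NSEntityDescription insertNewObjectForEntityForName:@"Record"
                                          inManagedObjectContext:privateContext];
        [record setValue:name forKey:@"name"];
        NSError *error = nil;
        if (![privateContext save:&error]) NSLog(@"Save failed: %@", error);
    }];
}

// performBlockAndWait: blocks the caller until the block has finished,
// which is handy when you need a result back.
static NSUInteger CountRecords(NSManagedObjectContext *context) {
    __block NSUInteger count = 0;
    [context performBlockAndWait:^{
        NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Record"];
        NSUInteger result = [context countForFetchRequest:request error:NULL];
        if (result != NSNotFound) count = result;
    }];
    return count;
}
```

Blocks submitted to the same context run in the order they were enqueued, so a CountRecords call made after InsertRecord on that context sees the inserted record.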
These new features are not discussed in great detail in the documentation, although there is some information in the section "Core Data Release Notes for iOS v5.0". There is a much more insightful discussion in the WWDC 2012 session video "Session 214 - Core Data Best Practices".

Resources