IOS Core Data asynchronous saving

IOS Core Data asynchronous saving - ios

Problem Statement
I am having trouble saving server side data asynchronously.
Structure
I am using the following structure of NSManagedObjectContext in order of parent to child:
writerManagedObjectContext (NSPrivateQueueConcurrencyType)
masterManagedObjectContext (NSMainQueueConcurrencyType)
backgroundManagedObjectContext (NSPrivateQueueConcurrencyType)
Code
I am using the following code to save data
[backgroundManagedObjectContext performBlock:^{
[backgroundManagedObjectContext save:nil];
[masterManagedObjectContext performBlock:^{ // Starts blocking UI from here
[masterManagedObjectContext save:nil];
[writerManagedObjectContext performBlock:^{
[writerManagedObjectContext save:nil];
}]
}]
}]
Problem
The code saves fine. The backgroundManagedObjectContext also saves asynchronously. However, both masterManagedObjectContext and writerManagedObjectContext refuse to be saved asynchronously and blocks the UI thread. (I know it blocks the UI thread because I tried to perform actions which have nothing to do with Core Data and they were also blocked. This is not a case of the persistent coordinator not being accessible)
Questions
What would be the reason the above code blocks the main thread?
Am I correct in assuming I can call the above code from anywhere since the save will be called in each respective thread/context?
Any help would be much appreciated.
Edit
http://floriankugler.com/2013/04/29/concurrent-core-data-stack-performance-shootout/
Apparently the freeze comes from the attempt to propagate to the parent NSManagedObjectContext. The article seems to elude to the fact that it is impossible to have a truly asynchronous save to the main context.
The data was heavily linked, 5MB, and took approximately 40s to save to psc. I don't think I'll use a parallel structure as described in the post since the code base is already large as it is. I would appreciate any strategies I can utilize to decrease this freeze.

Even though backgroundManagedObjectContext is a private queue context it still propagates its changes up to masterManagedObjectContext, being its parent. This could be where it's choking up. Merging changes from child contexts still takes up CPU time, the effects of which become more pronounced in a busy queue like the UI's.
You can always use Instruments to analyze what's going on.
If your particular use case permits, try setting backgroundManagedObjectContext.persistentStoreCoordinator to the same psc as writerManagedObjectContext and not make it a child of masterManagedObjectContext.
Better yet, use an awesome framework like MagicalRecord. Doesn't shield you from problems like this, but less code makes things (arguably) a bit easier to debug.

Related

Universal context for CoreData

I use NSPersistentContainer as a dependency in my classes. I find this approach quite useful, but there is a dilemma: I don't know in which thread my methods will be called. I found a very simple solution for this
extension NSPersistentContainer {
func getContext() -> NSManagedObjectContext {
if Thread.isMainThread {
return viewContext
} else {
return newBackgroundContext()
}
}
}
Looks wonderful but I still have a doubt is there any pitfalls? If it properly works, why on earth Core Data confuses us with its contexts?

It's OK as long as you can live with its inherent limitations, i.e.
When you're on the main queue, you always want the viewContext, never any other one.
When you're not on the main queue, you always want to create a new, independent context.
Some drawbacks that come to mind:
If you call a method that has an async completion handler, that handler might be called on a different queue. If you use this method, you might get a different context than when you made the call. Is that OK? It depends what you're doing in the handler.
Changes on one background context are not automatically available in other background contexts, so you run the risk of having multiple contexts with conflicting changes.
The method suggests a potential carelessness about which context you're using. It's important to be aware of which context you're using, because managed objects fetched on one can't be used with another. If your code just says, hey give me some context, but doesn't track the contexts properly, you increase the chance of crashes from accidentally crossing contexts.
If your non-main-queue requirements match the above, you're probably better off using the performBackgroundTask(_:) method on NSPersistentContainer. You're not adding anything to that method here.
[W]hy on earth Core Data confuses us with its contexts?
Managed object contexts are a fundamental part of how Core Data works. Keeping track of them is therefore a fundamental part of having an app that doesn't corrupt its data or crash.

Why does RestKit save all NSManagedObjectContexts via performBlockAndWait?

I noticed that when saving NSManagedObjectContexts, RestKit wraps the save call on each NSManagedObjectContext with a performBlockAndWait.
https://github.com/RestKit/RestKit/blob/development/Code/CoreData/NSManagedObjectContext%2BRKAdditions.m#L64
It was my understanding of managing parent and child NSManagedObjectContexts that only a NSManagedObjectContext with type MainQueueConcurrencyType should be saved this way (and that is usually the child context of another NSManagedObjectContext of type PrivateQueueConcurrencyType which is what is actually associated with persistentStoreCoordinator). I thought that the idea was that saving to the persistent store (ie disk) is a longer operation and doesn't, and shouldn't, be waited for. Where am I going wrong?

Everything you do with a ManagedObjectContext has to be done on that context's dispatch queue. The easiest way to make sure that happens is to do it in a block called by performBlock or performBlockAndWait. If there is code later in the method depending on the result of the block, performBlockAndWait is the way to go. If you have to add these blocks to older Core Data code, as in the case of RestKit, wrapping NSManagedObjectContext calls in a performBlockAndWait is a not-too-painful way to make your Core Data code (more) thread-safe.

Behavior differences between performBlock: and performBlockAndWait:?

I'm creating an NSManagedObjectContext in a private queue to handle data updates I take from files and/or services:
NSManagedObjectContext *privateContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
AppDelegate *appDelegate = [[UIApplication sharedApplication] delegate];
privateContext.persistentStoreCoordinator = appDelegate.persistentStoreCoordinator;
Since I'm using a private queue, I don't fully understand the difference between performBlock: and performBlockAndWait: methods... To perform my data updates I'm currently doing this:
[privateContext performBlock: ^{
// Parse files and/or call services and parse
// their responses
// Save context
[privateContext save:nil];
dispatch_async(dispatch_get_main_queue(), ^{
// Notify update to user
});
}];
In this case, my data updates are made synchronoulsy and sequentially, so I suppose that is the correct place to save the context, right? If I'm doing something wrong, I'd appreciate if you let me know. On the other hand, would this code be equivalent?:
[privateContext performBlockAndWait: ^{
// Parse files and/or call services and parse
// their responses
// Save context
[privateContext save:nil];
}];
// Notify update to user
Again I guess that is the correct place to save the context... what are the differences between both methods (if any, in this case)?
What if instead of performing synchronous service calls or files parsing, I need to perform asynchronous service calls? How would these data updates be managed?
Thanks in advance

You are correct in that anything you want to do with a MOC must be done within either performBlock or performBlockAndWait. Note that retain/release is thread safe for managed objects, so you don't have to be inside one of those blocks to retain/release reference counts on managed objects.
They both utilize a synchronous queue to process messages, which means that only one block will execute at a time. Well, that's almost true. See the descriptions of performBlockAndWait. In any event, the access to the MOC will be serialized such that only one thread is accessing the MOC at a time.
tl;dr Don't worry about the difference, and always use performBlock.
Factual Differences
There are a number of differences. I'm sure there are more, but here are the ones that I think are most important to understand.
Synchronous vs. Asynchronous
performBlock is asynchronous, in that it returns immediately, and the block is executed at some time in the future, on some undisclosed thread. All blocks given to the MOC via performBlock will execute in the order they were added.
performBlockAndWait is synchronous, in that the calling thread will wait until the block has executed before returning. Whether the block runs in some other thread, or runs in the calling thread is not all that important, and is an implementation detail that can't be trusted.
Note, however, that it could be implemented as "Hey, some other thread, go run this block. I'm gonna sit here doing nothing until you tell me it's done." Or, it could be implemented as "Hey, Core Data, give me a lock that prevents all those other blocks from running so I can run this block on my own thread." Or it could be implemented in some other way. Again, implementation detail, which could change at any time.
I'll tell you this though, the last time I tested it, performBlockAndWait executed the block on the calling thread (meaning the second option in the above paragraph). This is only really information to help you understand what is going on, and should not be relied upon in any way.
Reentrancy
performBlock is always asynchronous, and is thus not reentrant. Well, some may consider it reentrant, in that you can call it from within a block that was called with performBlock. However, if you do this, all calls to performBlock will return immediately, and the block will not execute until at least the currently executing block completely finishes its work.
[moc performBlock:^{
doSomething();
[moc performBlock:^{
doSomethingElse();
}];
doSomeMore();
}];
These functions will always be executed in this order:
doSomething()
doSomeMore()
doSomethingElse()
performBlockAndWait is always synchronous. Furthermore, it is also reentrant. Multiple calls will not deadlock. Thus, if you end up calling performBlockAndWait while you are inside a block that was being run as a result of another performBlockAndWait, then it's OK. You will get the expected behavior, in that the second call (and any subsequent calls) will not cause a deadlock. Furthermore, the second one will completely execute before it returns, as you would expect.
[moc performBlockAndWait:^{
doSomething();
[moc performBlockAndWait:^{
doSomethingElse();
}];
doSomeMore();
}];
These functions will always be executed in this order:
doSomething()
doSomethingElse()
doSomeMore()
FIFO
FIFO stands for "First In First Out" which means that blocks will be executed in the order in which they were put into the internal queue.
performBlock always honors the FIFO structure of the internal queue. Every block will be inserted into the queue, and only run when it is removed, in FIFO order.
By definition, performBlockAndWait breaks FIFO ordering because it jumps the queue of blocks that have already been enqueued.
Blocks submitted with performBlockAndWait do not have to wait for other blocks that are running in the queue. There are a number of ways to see this. One simple one is this.
[moc performBlock:^{
doSomething();
[moc performBlock:^{
doSomethingElse();
}];
doSomeMore();
[moc performBlockAndWait:^{
doSomethingAfterDoSomethingElse();
}];
doTheLastThing();
}];
These functions will always be executed in this order:
doSomething()
doSomeMore()
doSomethingAfterDoSomethingElse()
doTheLastThing()
doSomethingElse()
It's obvious in this example, which is why I used it. Consider, however, if your MOC is getting stuff called on it from multiple places. Could be a bit confusing.
The point to remember though, is that performBlockAndWait is preemptive and can jump the FIFO queue.
Deadlock
You will never get a deadlock calling performBlock. If you do something stupid inside the block, then you could deadlock, but calling performBlock will never deadlock. You can call it from anywhere, and it will simply add the block to the queue, and execute it some time in the future.
You can easily get deadlocks calling performBlockAndWait, especially if you call it from a method that an external entity can call or indiscriminately within nested contexts. Specifically, you are almost guaranteed to deadlock your applications if a parent calls performBlockAndWait on a child.
User Events
Core Data considers a "user event" to be anything between calls to processPendingChanges. You can read the details of why this method is important, but what happens in a "user event" has implications on notifications, undo management, delete propagation, change coalescing, etc.
performBlock encapsulates a "user event" which means the block of code is automatically executed between distinct calls to processPendingChanges.
performBlockAndWait does not encapsulate a "user event." If you want the block to be treated as a distinct user event, you must do it yourself.
Auto Release Pool
performBlock wraps the block in its own autoreleasepool.
performBlockAdWait does not provide a unique autoreleasepool. If you need one, you must provide it yourself.
Personal Opinions
Personally, I do not believe there are very many good reasons to use performBlockAndWait. I'm sure someone has a use case that can't be accomplished in any other way, but I've yet to see it. If anyone knows of that use case, please share it with me.
The closest is calling performBlockAndWait on a parent context (don't ever do this on a NSMainConcurrencyType MOC because it could lock up your UI). For example, if you want to ensure that the database has completely saved to disk before the current block returns and other blocks get a chance to run.
Thus, a while ago, I decided to treat Core Data as a completely asynchronous API. As a result, I have a whole lot of core data code, and I do not have one single call to performBlockAndWait, outside of tests.
Life is much better this way. I have much fewer problems than I did back when I thought "it must be useful or they wouldn't have provided it."
Now, I simply no longer have any need for performBlockAndWait. As a result, maybe it has changed some, and I just missed it because it no longer interests me... but I doubt that.

Core Data main context concurrency

So I've been reading about and working with Core Data pretty extensively over the past few weeks, and I've ended up implementing my Core Data stack using the following setup:
Save context - root context tied to the store, on PrivateQueue
Main context - running on main thread, child of save context
Edit contexts - created on-demand in any thread to edit UI data in the background as children of main context
With this setup, I haven't been having any issues, and I've taken the necessary precautions to prevent multiple background edit contexts from calling [editMoc save] simultaneously, but I'm not seeing how this is safe relative to the main thread/context. Specifically, all the code I keep seeing while using this setup is like the following:
[edit performBlock^{
[edit save:nil];
[main performBlock:^{
[main save:nil];
[root performBlock...]
}];
}];
Now the thing I have not been able to figure out is this: every time the child context saves, the data is being propagated to the main context, but nothing is happening to ensure that the data isn't currently being read from/acted upon in the main context, right? Essentially, I feel like something like this would be more appropriate:
[edit performBlock^{
dispatch_async(dispatch_get_main_queue(), ^{
[edit save:nil];
});
[main performBlock:^{
[main save:nil];
[root performBlock....]
}];
}];
At least if it were done that way, we'd ensure that the propagation is safe relative to the main thread/UI, no? This sort of ties in with my other question of "why should the main context really be associated with the main thread anyways?" Right now if I need to execute a fetch, I do it from the main context and I can understand that it then makes sense for it to be on the main thread, but if propagations can just happen "whenever" (since we aren't doing them on the main thread) what is the ultimate point of having it tied to the main thread? I don't even need to do a fetch because the data has seemingly unsafely arrived in the main context from the child save, yeah? From there a simple [tableView reload] suffices.
Any help is appreciated!
EDIT:
So I think I've found the answer to my question, which is that Core Data does in fact internally push the changes from the child context to the parent context using the correct thread (i.e. using the parent context's queue - which in this case would be the main queue). Meaning that it would in fact have to wait for anything else happening on the main thread that may have been using that data before it will dirty the main context.

Requesting Scalar Values from main thread managedobjectcontext from background thread

I have a method that runs in a background thread using a copy of NSManagedObjectContext which is specially generated when the background thread starts as per Apples recommendations.
In this method it makes a call to shared instance of a class, this shared instance is used for managing property values.
The shared instance that managed properties uses a NSManagedObjectContext on the main thread, now even though the background thread method should not use the NSManagedObjectContext on the main thread, it shouldn't really matter if the shared property manager class does or does not use the such a context as it only returns scalar values back to the background thread (at least that's my understanding).
So, why does the shared property class hang when retrieving values via the main threads context when called from the background thread? It doesn't need to pass an NSManagedObject or even update one so I cannot see what difference it would make.
I can appreciate that my approach is probably wrong but I want to understand at a base level why this is. At the moment I cannot understand this whole system enough to be able to think beyond Apples recommended methods of implementation and that's just a black magic approach which I don't like.
Any help is greatly appreciated.

Does using:
[theContext performBlock:^{
// do stuff on the context's queue, launch asynchronously
}];
-- or --
[theContext performBlockAndWait:^{
// do stuff on the context's queue, run synchronously
}];
-- just work for you? If so, you're done.
If not, take a long, hard look at how your contexts are setup, being passed around, and used. If they all share a root context, you should be able to "move" data between them easily, so long as you lookup any objectIDs always on your current context.
Contexts are bound to threads/queues, basically, so always use a given context as a a reference for where to do work. performBlock: is one way to do this.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart