I have a method similar to this:
- (void)handleUpdate
{
    dispatch_sync(dispatch_get_main_queue(), ^{
        NSArray *objectIDs = [self.objectsInMainContext valueForKeyPath:@"objectID"];
        [self.privateContext performBlockAndWait:^{
            // Some processing
        }];
    });
}
What I called mainContext is associated with the main queue, and privateContext is associated with a private queue and is a child of mainContext. This method is called from privateContext's private queue. privateContext is not nil when the performBlockAndWait: call is reached, but execution never enters the block, and no code after this method call is reached either...
What could I be missing here?
Thanks in advance
EDIT: I get no error in Xcode, the breakpoints I've set in the code inside the performBlockAndWait: block and after this method call are simply not reached.
EDIT 2: I corrected the code snippet, I wasn't accessing the mainContext but an array of objects associated to mainContext.
If handleUpdate is called from the private queue as you say, then this is a classic deadlock.
You call dispatch_sync from the private queue. That means "wait until this returns before continuing on this queue." Then, inside that block, you enqueue another block that has to wait to run on the private queue and won't return until it does. The queue is now waiting on itself and will never progress.
I suspect you want dispatch_sync to be dispatch_async here.
Alright, there is at least 1 thing that is wrong in your situation:
Calling performBlockAndWait: on the main thread (via passing it to the main queue). The documentation says that performBlockAndWait:
Synchronously performs a given block on the receiver’s queue.
As a result of that call, you will get a deadlock on the main thread (as you use dispatch_sync outside).
UPDATE on the dispatch_async way:
The problem will not disappear. That's because you'll still be using the main queue (which implies the main thread in the end) and the main CD context (which also implies the main thread). The problem is that you make the main thread wait until the job is done on the same (main) thread: a deadlock out of the box.
You have several problems.
First, you are using GCD directly to serialize access to an NSMainQueueConcurrencyType MOC. You should never use anything but the Core Data synchronization mechanisms for accessing Core Data. Yes, it is safe to access an NSMainQueueConcurrencyType MOC from the main thread. However, you should never use GCD directly... especially dispatch_sync.
Second, you are using dispatch_sync which is not reentrant. It can, and will, cause deadlocks. You should hardly ever use this function. Consider it a very sharp instrument, to be used where it is the only solution to a very specific problem.
Third, you are calling performBlockAndWait on a child context, which should never be done because it can cause deadlocks. In general, you should avoid the non-asynchronous threading models unless you really know what you are doing.
Your code should be something more like this. Each performBlock call will post a block onto a message queue, which will get processed independently of any waiting thread.
- (void)handleUpdate
{
    [self.mainContext performBlock:^{
        NSArray *objectIDs = [self.objectsInMainContext valueForKeyPath:@"objectID"];
        [self.privateContext performBlock:^{
            // Some processing with objectIDs
        }];
    }];
}
Related
I need some clarification on how dispatch queues relate to reentrancy and deadlocks.
Reading this blog post Thread Safety Basics on iOS/OS X, I encountered this sentence:
All dispatch queues are non-reentrant, meaning you will deadlock if
you attempt to dispatch_sync on the current queue.
So, what is the relationship between reentrancy and deadlock? Why, if a dispatch_queue is non-reentrant, does a deadlock arise when you are using dispatch_sync call?
In my understanding, you can have a deadlock using dispatch_sync only if the thread you are running on is the same thread the block is dispatched to.
A simple example is the following. If I run the code on the main thread, dispatch_get_main_queue() will grab the main thread as well, and I will end up in a deadlock.
dispatch_sync(dispatch_get_main_queue(), ^{
    NSLog(@"Deadlock!!!");
});
Any clarifications?
All dispatch queues are non-reentrant, meaning you will deadlock if
you attempt to dispatch_sync on the current queue.
So, what is the relationship between reentrancy and deadlock? Why, if
a dispatch_queue is non-reentrant, does a deadlock arise when you are
using dispatch_sync call?
Without having read that article, I imagine that statement was in reference to serial queues, because it's otherwise false.
Now, let's consider a simplified conceptual view of how dispatch queues work (in some made-up pseudo-language). We also assume a serial queue, and don't consider target queues.
Dispatch Queue
When you create a dispatch queue, basically you get a FIFO queue, a simple data structure where you can push objects on the end, and take objects off the front.
You also get some complex mechanisms to manage thread pools and do synchronization, but most of that is for performance. Let's simply assume that you also get a thread that just runs an infinite loop, processing messages from the queue.
void processQueue(queue) {
    for (;;) {
        waitUntilQueueIsNotEmptyInAThreadSafeManner(queue);
        block = removeFirstObject(queue);
        block();
    }
}
dispatch_async
Taking the same simplistic view of dispatch_async yields something like this...
void dispatch_async(queue, block) {
    appendToEndInAThreadSafeManner(queue, block);
}
All it is really doing is taking the block, and adding it to the queue. This is why it returns immediately, it just adds the block onto the end of the data structure. At some point, that other thread will pull this block off the queue, and execute it.
Note that this is where the FIFO guarantee comes into play. The thread pulling blocks off the queue and executing them always takes them in the order that they were placed on the queue. It then waits until that block has fully executed before getting the next block off the queue.
dispatch_sync
Now, another simplistic view of dispatch_sync. In this case, the API guarantees that it will wait until the block has run to completion before it returns. In particular, calling this function does not violate the FIFO guarantee.
void dispatch_sync(queue, block) {
    bool done = false;
    dispatch_async(queue, { block(); done = true; });
    while (!done) { }
}
Now, this is actually done with semaphores, so there is no busy-waiting CPU loop or boolean flag, and it doesn't use a separate block, but we are trying to keep it simple. You should get the idea.
The block is placed on the queue, and then the function waits until it knows for sure that "the other thread" has run the block to completion.
Reentrancy
Now, we can get a reentrant call in a number of different ways. Let's consider the most obvious.
block1 = {
    dispatch_sync(queue, block2);
}
dispatch_sync(queue, block1);
This will place block1 on the queue, and wait for it to run. Eventually the thread processing the queue will pop block1 off, and start executing it. When block1 executes, it will put block2 on the queue, and then wait for it to finish executing.
This is one meaning of reentrancy: when you re-enter a call to dispatch_sync from another call to dispatch_sync.
Deadlock from reentering dispatch_sync
However, block1 is now running inside the queue's for loop. That code is executing block1, and will not process anything more from the queue until block1 completes.
Block1, though, has placed block2 on the queue, and is waiting for it to complete. Block2 has indeed been placed on the queue, but it will never be executed. Block1 is "waiting" for block2 to complete, but block2 is sitting on a queue, and the code that pulls it off the queue and executes it will not run until block1 completes.
Deadlock from NOT reentering dispatch_sync
Now, what if we change the code to this...
block1 = {
    dispatch_sync(queue, block2);
}
dispatch_async(queue, block1);
We are not technically reentering dispatch_sync. However, we still have the same scenario, it's just that the thread that kicked off block1 is not waiting for it to finish.
We are still running block1, waiting for block2 to finish, but the thread that will run block2 must finish with block1 first. This will never happen because the code to process block1 is waiting for block2 to be taken off the queue and executed.
Thus reentrancy for dispatch queues is not technically reentering the same function, but reentering the same queue processing.
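Both deadlocks above can be reproduced in a toy model. The following is a minimal single-threaded sketch in plain C, not GCD itself: toy_queue, toy_async, toy_drain, and toy_sync are hypothetical names, and instead of hanging, toy_sync counts the call that a real dispatch_sync could never return from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CAPACITY 8

typedef struct toy_queue toy_queue;
typedef void (*toy_block)(toy_queue *q);

struct toy_queue {
    toy_block pending[TOY_CAPACITY]; /* FIFO of blocks not yet run */
    size_t head, tail;               /* head == tail means empty */
    bool busy;   /* true while the queue's worker is inside a block */
    int stuck;   /* sync calls that could never have returned */
};

/* Like dispatch_async: append the block and return immediately. */
void toy_async(toy_queue *q, toy_block b) {
    assert(q->tail < TOY_CAPACITY);
    q->pending[q->tail++] = b;
}

/* The worker loop: run blocks one at a time, in FIFO order. */
void toy_drain(toy_queue *q) {
    while (q->head < q->tail) {
        toy_block b = q->pending[q->head++];
        q->busy = true;
        b(q);
        q->busy = false;
    }
}

/* Like dispatch_sync, but reports the hang instead of hanging.
 * Returns false when the caller would have waited forever. */
bool toy_sync(toy_queue *q, toy_block b) {
    if (q->busy) {
        /* The caller IS the block the worker is running, so the worker
         * cannot reach b until the caller returns. A real queue would
         * enqueue b and wait forever; the model only counts the event. */
        q->stuck++;
        return false;
    }
    toy_async(q, b);
    toy_drain(q); /* stand-in for "wait until the worker has run b" */
    return true;
}

int block2_runs = 0;

void block2(toy_queue *q) { (void)q; block2_runs++; }
void block1(toy_queue *q) { toy_sync(q, block2); }

/* dispatch_sync(queue, block1), where block1 calls dispatch_sync. */
int reentered_sync(void) {
    toy_queue q = {0};
    toy_sync(&q, block1);
    return q.stuck;
}

/* dispatch_async(queue, block1), where block1 calls dispatch_sync. */
int async_then_sync(void) {
    toy_queue q = {0};
    toy_async(&q, block1);
    toy_drain(&q); /* the worker eventually picks block1 up */
    return q.stuck;
}
```

In this model reentered_sync() and async_then_sync() each return 1, and block2 never runs: whether block1 was submitted synchronously or asynchronously makes no difference, because the stuck call is the one made from inside the queue's own processing.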
Deadlocks from NOT reentering the queue at all
In its simplest (and most common) case, let's assume [self foo] gets called on the main thread, as is common for UI callbacks.
-(void) foo {
    dispatch_sync(dispatch_get_main_queue(), ^{
        // Never gets here
    });
}
This doesn't "reenter" the dispatch queue API, but it has the same effect. We are running on the main thread. The main thread is where the blocks are taken off the main queue and processed. The main thread is currently executing foo and a block is placed on the main-queue, and foo then waits for that block to be executed. However, it can only be taken off the queue and executed after the main thread gets done with its current work.
This will never happen because the main thread will not progress until foo completes, but foo will never complete until the block it is waiting for runs... which will not happen.
In my understanding, you can have a deadlock using dispatch_sync only
if the thread you are running on is the same thread where the block is
dispatch into.
As the aforementioned example illustrates, that's not the case.
Furthermore, there are other scenarios that are similar, but not so obvious, especially when the sync access is hidden in layers of method calls.
Avoiding deadlocks
The only sure way to avoid deadlocks is to never call dispatch_sync (that's not exactly true, but it's close enough). This is especially true if you expose your queue to users.
If you use a self-contained queue, and control its use and target queues, you can maintain some control when using dispatch_sync.
There are, indeed, some valid uses of dispatch_sync on a serial queue, but most are probably unwise, and should only be done when you know for certain that you will not be 'sync' accessing the same or another resource (the latter is known as deadly embrace).
EDIT
Jody, Thanks a lot for your answer. I really understood all of your
stuff. I would like to put more points...but right now I cannot. 😢 Do
you have any good tips in order to learn this under the hood stuff? –
Lorenzo B.
Unfortunately, the only books on GCD that I've seen are not very advanced. They go over the easy surface level stuff on how to use it for simple general use cases (which I guess is what a mass market book is supposed to do).
However, GCD is open source. Here is the webpage for it, which includes links to their svn and git repositories. However, the webpage looks old (2010) and I'm not sure how recent the code is. The most recent commit to the git repository is dated Aug 9, 2012.
I'm sure there are more recent updates; but not sure where they would be.
In any event, I doubt the conceptual framework of the code has changed much over the years.
Also, the general idea of dispatch queues is not new, and has been around in many forms for a very long time.
Many moons ago, I spent my days (and nights) writing kernel code (worked on what we believe to have been the very first symmetric multiprocessing implementation of SVR4), and then when I finally breached the kernel, I spent most of my time writing SVR4 STREAMS drivers (wrapped by user space libraries). Eventually, I made it fully into user space, and built some of the very first HFT systems (though it wasn't called that back then).
The dispatch queue concept was prevalent in every bit of that. Its emergence as a generally available user space library is only a somewhat recent development.
Edit #2
Jody, thanks for your edit. So, to recap a serial dispatch queue is
not reentrant since it could produce an invalid state (a deadlock).
On the contrary, a reentrant function will not produce it. Am I right?
– Lorenzo B.
I guess you could say that, because it does not support reentrant calls.
However, I think I would prefer to say that the deadlock is the result of preventing invalid state. If anything else occurred, then either the state would be compromised, or the definition of the queue would be violated.
Core Data's performBlockAndWait
Consider -[NSManagedObjectContext performBlockAndWait:]. It's non-asynchronous, and it is reentrant. It has some pixie dust sprinkled around the queue access so that the second block runs immediately, when called from "the queue." Thus, it has the traits I described above.
[moc performBlock:^{
    [moc performBlockAndWait:^{
        // This block runs immediately, and to completion before returning
        // However, `dispatch_async`/`dispatch_sync` would deadlock
    }];
}];
The above code does not "produce a deadlock" from reentrancy (but the API can't avoid deadlocks entirely).
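One way to picture the "pixie dust" is a sync call that checks whether the caller is already running on the queue and, if so, runs the block inline instead of enqueueing it. The sketch below is a single-threaded toy model in plain C under that assumption. It is not Core Data's actual implementation, and rq, rq_async, rq_drain, and rq_sync are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define RQ_CAPACITY 8

typedef struct rq rq;
typedef void (*rq_block)(rq *q);

struct rq {
    rq_block pending[RQ_CAPACITY]; /* FIFO of blocks not yet run */
    size_t head, tail;
    int depth; /* > 0 while the caller is already running on this queue */
    int ran;   /* number of blocks that ran to completion */
};

/* Run one block "on the queue", tracking how deeply we are nested. */
void rq_run(rq *q, rq_block b) {
    q->depth++;
    b(q);
    q->depth--;
    q->ran++;
}

/* Models an asynchronous submit: enqueue and return immediately. */
void rq_async(rq *q, rq_block b) {
    assert(q->tail < RQ_CAPACITY);
    q->pending[q->tail++] = b;
}

/* The queue's worker loop: FIFO, one block at a time. */
void rq_drain(rq *q) {
    while (q->head < q->tail) {
        rq_block b = q->pending[q->head++];
        rq_run(q, b);
    }
}

/* Reentrant synchronous submit (assumed behaviour, see lead-in). */
void rq_sync(rq *q, rq_block b) {
    if (q->depth > 0) {
        /* Already on the queue: run inline rather than wait on ourselves. */
        rq_run(q, b);
        return;
    }
    rq_async(q, b);
    rq_drain(q); /* single-threaded stand-in for waiting on the worker */
}

void inner(rq *q) { (void)q; }
void outer(rq *q) { rq_sync(q, inner); } /* nested synchronous call */

/* Mirrors: performBlock { performBlockAndWait { ... } } */
int nested_sync_completes(void) {
    rq q = {0};
    rq_async(&q, outer);
    rq_drain(&q);
    return q.ran;
}
```

Here nested_sync_completes() returns 2: both the outer and the inner block finish, where the non-reentrant queue would have stalled on the inner call.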
However, depending on who you talk to, doing this can produce invalid (or unpredictable/unexpected) state. In this simple example, it's clear what's happening, but in more complicated parts it can be more insidious.
At the very least, you must be very careful about what you do inside a performBlockAndWait.
Now, in practice, this is only a real issue for main-queue MOCs, because the main run loop is running on the main queue, so performBlockAndWait recognizes that and immediately executes the block. However, most apps have a MOC attached to the main queue, and respond to user save events on the main queue.
If you want to watch how dispatch queues interact with the main run loop, you can install a CFRunLoopObserver on the main run loop, and watch how it processes the various input sources in the main run loop.
If you've never done that, it's an interesting and educational experiment (though you can't assume what you observe will always be that way).
Anyway, I generally try to avoid both dispatch_sync and performBlockAndWait.
I'm creating an NSManagedObjectContext in a private queue to handle data updates I take from files and/or services:
NSManagedObjectContext *privateContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
AppDelegate *appDelegate = [[UIApplication sharedApplication] delegate];
privateContext.persistentStoreCoordinator = appDelegate.persistentStoreCoordinator;
Since I'm using a private queue, I don't fully understand the difference between performBlock: and performBlockAndWait: methods... To perform my data updates I'm currently doing this:
[privateContext performBlock:^{
    // Parse files and/or call services and parse
    // their responses
    // Save context
    [privateContext save:nil];
    dispatch_async(dispatch_get_main_queue(), ^{
        // Notify update to user
    });
}];
In this case, my data updates are made synchronously and sequentially, so I suppose that is the correct place to save the context, right? If I'm doing something wrong, I'd appreciate it if you let me know. On the other hand, would this code be equivalent?
[privateContext performBlockAndWait:^{
    // Parse files and/or call services and parse
    // their responses
    // Save context
    [privateContext save:nil];
}];
// Notify update to user
Again I guess that is the correct place to save the context... what are the differences between both methods (if any, in this case)?
What if instead of performing synchronous service calls or files parsing, I need to perform asynchronous service calls? How would these data updates be managed?
Thanks in advance
You are correct in that anything you want to do with a MOC must be done within either performBlock or performBlockAndWait. Note that retain/release is thread safe for managed objects, so you don't have to be inside one of those blocks to retain/release reference counts on managed objects.
They both utilize a synchronous queue to process messages, which means that only one block will execute at a time. Well, that's almost true. See the descriptions of performBlockAndWait. In any event, the access to the MOC will be serialized such that only one thread is accessing the MOC at a time.
tl;dr Don't worry about the difference, and always use performBlock.
Factual Differences
There are a number of differences. I'm sure there are more, but here are the ones that I think are most important to understand.
Synchronous vs. Asynchronous
performBlock is asynchronous, in that it returns immediately, and the block is executed at some time in the future, on some undisclosed thread. All blocks given to the MOC via performBlock will execute in the order they were added.
performBlockAndWait is synchronous, in that the calling thread will wait until the block has executed before returning. Whether the block runs in some other thread, or runs in the calling thread is not all that important, and is an implementation detail that can't be trusted.
Note, however, that it could be implemented as "Hey, some other thread, go run this block. I'm gonna sit here doing nothing until you tell me it's done." Or, it could be implemented as "Hey, Core Data, give me a lock that prevents all those other blocks from running so I can run this block on my own thread." Or it could be implemented in some other way. Again, implementation detail, which could change at any time.
I'll tell you this though, the last time I tested it, performBlockAndWait executed the block on the calling thread (meaning the second option in the above paragraph). This is only really information to help you understand what is going on, and should not be relied upon in any way.
Reentrancy
performBlock is always asynchronous, and is thus not reentrant. Well, some may consider it reentrant, in that you can call it from within a block that was called with performBlock. However, if you do this, all calls to performBlock will return immediately, and the block will not execute until at least the currently executing block completely finishes its work.
[moc performBlock:^{
    doSomething();
    [moc performBlock:^{
        doSomethingElse();
    }];
    doSomeMore();
}];
These functions will always be executed in this order:
doSomething()
doSomeMore()
doSomethingElse()
performBlockAndWait is always synchronous. Furthermore, it is also reentrant. Multiple calls will not deadlock. Thus, if you end up calling performBlockAndWait while you are inside a block that was being run as a result of another performBlockAndWait, then it's OK. You will get the expected behavior, in that the second call (and any subsequent calls) will not cause a deadlock. Furthermore, the second one will completely execute before it returns, as you would expect.
[moc performBlockAndWait:^{
    doSomething();
    [moc performBlockAndWait:^{
        doSomethingElse();
    }];
    doSomeMore();
}];
These functions will always be executed in this order:
doSomething()
doSomethingElse()
doSomeMore()
FIFO
FIFO stands for "First In First Out" which means that blocks will be executed in the order in which they were put into the internal queue.
performBlock always honors the FIFO structure of the internal queue. Every block will be inserted into the queue, and only run when it is removed, in FIFO order.
By definition, performBlockAndWait breaks FIFO ordering because it jumps the queue of blocks that have already been enqueued.
Blocks submitted with performBlockAndWait do not have to wait for other blocks that are running in the queue. There are a number of ways to see this. One simple one is this.
[moc performBlock:^{
    doSomething();
    [moc performBlock:^{
        doSomethingElse();
    }];
    doSomeMore();
    [moc performBlockAndWait:^{
        doSomethingAfterDoSomethingElse();
    }];
    doTheLastThing();
}];
These functions will always be executed in this order:
doSomething()
doSomeMore()
doSomethingAfterDoSomethingElse()
doTheLastThing()
doSomethingElse()
It's obvious in this example, which is why I used it. Consider, however, if your MOC is getting stuff called on it from multiple places. Could be a bit confusing.
The point to remember though, is that performBlockAndWait is preemptive and can jump the FIFO queue.
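The queue-jumping order above can be reproduced with a small single-threaded model in plain C. It assumes, as this answer reports from its own testing, that performBlockAndWait runs its block at once on the calling thread. toy_moc, perform_block, perform_block_and_wait, and run_pending are hypothetical stand-ins, not Core Data API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOC_CAPACITY 8

typedef struct toy_moc toy_moc;
typedef void (*moc_block)(toy_moc *moc);

struct toy_moc {
    moc_block pending[MOC_CAPACITY]; /* the context's internal FIFO */
    size_t head, tail;
    char log[16];                    /* one letter per step, in run order */
};

/* Append one step letter to the execution log. */
void note(toy_moc *moc, char step) {
    size_t n = strlen(moc->log);
    assert(n + 1 < sizeof moc->log);
    moc->log[n] = step;
    moc->log[n + 1] = '\0';
}

/* Models performBlock: enqueue and return at once. */
void perform_block(toy_moc *moc, moc_block b) {
    assert(moc->tail < MOC_CAPACITY);
    moc->pending[moc->tail++] = b;
}

/* Models performBlockAndWait under the stated assumption:
 * the block runs right now, ahead of anything already enqueued. */
void perform_block_and_wait(toy_moc *moc, moc_block b) {
    b(moc);
}

/* Models the context's queue being drained in FIFO order. */
void run_pending(toy_moc *moc) {
    while (moc->head < moc->tail) {
        moc_block b = moc->pending[moc->head++];
        b(moc);
    }
}

void something_else(toy_moc *moc)       { note(moc, 'E'); }
void after_something_else(toy_moc *moc) { note(moc, 'A'); }

void outer(toy_moc *moc) {
    note(moc, 'S');                                    /* doSomething */
    perform_block(moc, something_else);                /* queued behind outer */
    note(moc, 'M');                                    /* doSomeMore */
    perform_block_and_wait(moc, after_something_else); /* jumps the queue */
    note(moc, 'L');                                    /* doTheLastThing */
}

/* Runs the scenario and returns the recorded order of steps. */
const char *queue_jump_order(void) {
    static toy_moc moc;
    moc = (toy_moc){0};
    perform_block(&moc, outer);
    run_pending(&moc);
    return moc.log;
}
```

queue_jump_order() returns "SMALE": doSomething, doSomeMore, doSomethingAfterDoSomethingElse, doTheLastThing, and only then doSomethingElse, even though doSomethingElse was submitted first.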
Deadlock
You will never get a deadlock calling performBlock. If you do something stupid inside the block, then you could deadlock, but calling performBlock will never deadlock. You can call it from anywhere, and it will simply add the block to the queue, and execute it some time in the future.
You can easily get deadlocks calling performBlockAndWait, especially if you call it from a method that an external entity can call or indiscriminately within nested contexts. Specifically, you are almost guaranteed to deadlock your applications if a parent calls performBlockAndWait on a child.
User Events
Core Data considers a "user event" to be anything between calls to processPendingChanges. You can read the details of why this method is important, but what happens in a "user event" has implications on notifications, undo management, delete propagation, change coalescing, etc.
performBlock encapsulates a "user event" which means the block of code is automatically executed between distinct calls to processPendingChanges.
performBlockAndWait does not encapsulate a "user event." If you want the block to be treated as a distinct user event, you must do it yourself.
Auto Release Pool
performBlock wraps the block in its own autoreleasepool.
performBlockAndWait does not provide a unique autoreleasepool. If you need one, you must provide it yourself.
Personal Opinions
Personally, I do not believe there are very many good reasons to use performBlockAndWait. I'm sure someone has a use case that can't be accomplished in any other way, but I've yet to see it. If anyone knows of that use case, please share it with me.
The closest is calling performBlockAndWait on a parent context (don't ever do this on an NSMainQueueConcurrencyType MOC because it could lock up your UI). For example, if you want to ensure that the database has completely saved to disk before the current block returns and other blocks get a chance to run.
Thus, a while ago, I decided to treat Core Data as a completely asynchronous API. As a result, I have a whole lot of core data code, and I do not have one single call to performBlockAndWait, outside of tests.
Life is much better this way. I have far fewer problems than I did back when I thought "it must be useful or they wouldn't have provided it."
Now, I simply no longer have any need for performBlockAndWait. As a result, maybe it has changed some, and I just missed it because it no longer interests me... but I doubt that.
When I send a performBlockAndWait: message to my MOC of type NSPrivateQueueConcurrencyType, like this:
[self.privateManagedObjectContext performBlockAndWait:^{
    if ([[NSThread currentThread] isMainThread]) {
        NSLog(@"executing on the main thread!!");
    }
    …
}];
I find that, by default, this executes on the main thread. The conditional in the above code triggers, and the Issue Navigator indicates that execution is occurring on Thread 1 in the NSManagedObject Queue.
This is very puzzling to me, because Apple tells us that "each thread must have its own entirely private managed object context." Given that an MOC of type NSMainQueueConcurrencyType will use the main thread, doesn't it violate thread confinement for an MOC of type NSPrivateQueueConcurrencyType to use the main thread?
Is the execution of my code on the main thread normal? Have I misunderstood thread confinement? I understand that a queue is not necessarily tied to a particular thread, but in this case it seems the private MOC queue should at a minimum avoid the main thread, if not have a single go-to thread. I'm having some weird bugs, so I need to figure out what's going on. Thanks!
This optimization is possible because performBlockAndWait: executes the block
synchronously, i.e. the method does not return until the block has finished.
Therefore the block will not be executed in parallel with other operations on
the main thread.
(For the same reason, dispatch_sync(queue, ...) may execute a block on the main thread
instead of a separate thread.)
Utility.managedObjectContext().performBlockAndWait({
})
dispatch_sync(dispatch_get_main_queue(), {
})
I'm curious what the difference is between the two snippets above. The context was created with the .MainQueueConcurrencyType option.
If I perform blocks on the main queue, are they executed in FIFO order? Or can they overlap, with their operations interleaved? I.e., can (a1,a2,a3),(b1,b2,b3) result in (a1,b1,a2,a3,b2,b3)?
You are mixing two entirely different concepts here, but since it is the main thread/context/queue, your mix is masked and it "works".
Managed object context's performBlockAndWait: and performBlock: methods do not make any guarantees on which thread the block is executed, only that data accessed/mutated is safely accessed. Since your context is of main queue concurrency type, it is the exception in that it is safe to touch its objects outside of the performBlockAndWait: and performBlock: methods, on the main thread only. So when you queue your block to run on the main queue, it is guaranteed to run on the main thread, and thus your data is safe.
Block execution on the main thread is not atomic. Otherwise, what is the point of multithreading? To ensure data safety, you must make sure the performBlockAndWait: and performBlock: methods are used when accessing data. You are guaranteed that main queue scheduled blocks will run uninterrupted by other main queue scheduled blocks, and managed object context queues (background or main) are serial, so only one block at a time will be allowed to access data.
I'm experimenting with AFIncrementalStore, which is awesome, but I'm noticing that I am having some performance issues.
Specifically, I'm using it to bring down a bunch of facebook friends info from the facebook graph api, and I'm seeing some pretty slow clocktimes for save operations. For context, I'm loading in about 900 records. Instruments is telling me that the problem line is this:
NSManagedObjectID *backingObjectID = [self objectIDForBackingObjectForEntity:entity withResourceIdentifier:resourceIdentifier];
which in turn calls this
[backingContext performBlockAndWait:^{
    backingObjectID = [[backingContext executeFetchRequest:fetchRequest error:&error] lastObject];
}];
Has anyone had any experience using AFIncrementalStore with larger data sets?
Something else I don't quite understand is why all this action is happening on the main thread when it's all getting kicked off using a performBlockAndWait operation from a context with PrivateQueueConcurrencyType. Any help greatly appreciated!
Just a partial answer:
performBlockAndWait: will execute the block on the private queue, but the calling thread will "appear to wait" until the block is finished. (note the "appear to wait", explained below).
The queue is the underlying mechanism to synchronize access to the shared resources. That ensures that shared resources can be accessed safely from concurrent code.
Now, GCD can apply an optimization when selecting the thread used to drive the queue: if you dispatch synchronously, GCD may choose to use the current thread to drive the private dedicated queue.
Note: blocks enqueued on a particular queue can be executed on any thread. Nonetheless, the "execution context" is the queue - which determines synchronization.
So, in other words, performBlockAndWait: appears to be called synchronously. If the block is executed on the same thread, this thread does not block. It just switches to the private queue while executing the block (and thereby guarantees safe shared access). This makes sense, as the name of the message indicates: "..AndWait".