Is there any difference between dataWithContentsOfURL (threaded) and dataTaskWithURL? - ios

We're using dataWithContentsOfURL because it is, uh, simple...
NSData *datRaw = [NSData dataWithContentsOfURL:ur];
Now, of course, that will hang the main UI thread.
So we put it on another thread. We do that exactly thus,
-(void)performSearch:(NSString *)stuff then:(void(^)(void))after
{
dispatch_queue_t otherThread =dispatch_queue_create(nil,0);
dispatch_queue_t mainThread =dispatch_get_main_queue();
dispatch_async(otherThread,
^{
self.resultsRA = [self ... calls dataWithContentsOfURL ...];
dispatch_async(mainThread, ^{ if (after) after(); });
});
}
(Incidentally, here's an excellent introduction to that if needed https://stackoverflow.com/a/7291056/294884).
Well now, Apple says you should not use dataWithContentsOfURL; they say you should instead use NSURLSession, that is, dataTaskWithURL:completionHandler:
My question: is there any difference at all between making our own thread (i.e. with dataWithContentsOfURL) and using dataTask?
Are we wrong to use dataWithContentsOfURL: on a thread, for some reason? I appreciate it is more convenient, etc. I mean, is there a real difference, any dangers, etc.?

One reason to prefer true async io over threaded synchronous io is that threads are not free memory-wise. It's not a huge deal in general but you can save a little memory in your app and (more importantly) a little wired memory in the OS's kernel by not keeping a thread sitting around doing nothing while it waits.

Some of the reasons I can see:
With a synchronous request you can't track download progress and can't resume a download. If you download a big file and it fails at 99%, you have to download the whole file again.
As Apple states, "Do not use this synchronous method to request network-based URLs. For network-based URLs, this method can block the current thread for tens of seconds on a slow network...". If you are using GCD, you don't directly control the thread you are given, and blocking it may hold up other important work scheduled on that thread; the dataTask scheduler may have a better overview of system resources. If you create a thread manually, the blocked thread may overload the system (at least if resources are already strained).
Also there is "added ability to support custom authentication and cancellation" in dataTaskWithURL:.
You may need to customize the request headers or body. Maybe that falls in the "convenience" category, but it's another consideration.
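For comparison, here is a minimal sketch of what the dataTask version of the question's pattern might look like. FetchData is a hypothetical helper name, not anything from the question, and the hop back to the main queue simply mirrors the dispatch_async(mainThread, ...) call in the original code.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: loads `url` asynchronously and reports the result
// on the main queue. No thread is parked waiting for the network.
static NSURLSessionDataTask *FetchData(NSURL *url,
                                       void (^completion)(NSData *data, NSError *error))
{
    NSURLSessionDataTask *task =
        [[NSURLSession sharedSession] dataTaskWithURL:url
                                    completionHandler:^(NSData *data,
                                                        NSURLResponse *response,
                                                        NSError *error) {
            // The handler is called off the main thread, so hop back
            // before touching UI state.
            dispatch_async(dispatch_get_main_queue(), ^{
                if (completion) completion(data, error);
            });
        }];
    [task resume];  // tasks are created suspended
    return task;    // keep it if you want to call -cancel later
}
```

Returning the task is the part the threaded dataWithContentsOfURL: approach cannot offer: the caller can cancel a request that is no longer needed.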

Related

Performance issues with AFIncrementalStore

I'm experimenting with AFIncrementalStore, which is awesome, but I'm noticing that I am having some performance issues.
Specifically, I'm using it to bring down a bunch of facebook friends info from the facebook graph api, and I'm seeing some pretty slow clocktimes for save operations. For context, I'm loading in about 900 records. Instruments is telling me that the problem line is this:
NSManagedObjectID *backingObjectID = [self objectIDForBackingObjectForEntity:entity withResourceIdentifier:resourceIdentifier];
which in turn calls this
[backingContext performBlockAndWait:^{
backingObjectID = [[backingContext executeFetchRequest:fetchRequest error:&error] lastObject];
}];
Has anyone had any experience with using AFIncremental store with larger data sets?
Something else I don't quite understand is why all this action is happening on the main thread when it's all getting kicked off using a performBlockAndWait operation from a context with PrivateQueueConcurrencyType. Any help greatly appreciated!
Just a partial answer:
performBlockAndWait: will execute the block on the private queue, but the calling thread will "appear to wait" until the block is finished. (note the "appear to wait", explained below).
The queue is the underlying mechanism to synchronize access to the shared resources. That ensures that shared resources can be accessed safely from concurrent code.
Now, GCD can apply an optimization regarding selecting the thread used to drive the queue: if you dispatch synchronously GCD may choose to use the current thread which drives the private dedicated queue.
Note: blocks enqueued on a particular queue can be executed on any thread. Nonetheless, the "execution context" is the queue - which determines synchronization.
So, in other words, performBlockAndWait: will appear as if it is called synchronously. If the block is executed on the same thread, this thread does not block. It just switches to the private queue while executing the block (and thereby guarantees synchronized access to the shared resources). This makes sense, as the name of the message indicates: "..AndWait".

How to dispatch_after in the current queue?

Now that dispatch_get_current_queue is deprecated in iOS 6, how do I use dispatch_after to execute something in the current queue?
The various links in the comments don't say "it's better not to do it." They say you can't do it. You must either pass the queue you want or dispatch to a known queue. Dispatch queues don't have the concept of "current." Blocks often feed from one queue to another (called "targeting"). By the time you're actually running, the "current" queue is not really meaningful, and relying on it can (and historically did) lead to deadlock. dispatch_get_current_queue() was never meant for dispatching; it was a debugging method. That's why it was deprecated (since people treated it as if it meant something meaningful).
If you need that kind of higher-level book-keeping, use an NSOperationQueue which tracks its original queue (and has a simpler queuing model that makes "original queue" much more meaningful).
There are several approaches used in UIKit that are appropriate:
Pass the call-back dispatch_queue as a parameter (this is probably the most common approach in new APIs). See [NSURLConnection setDelegateQueue:] or addObserverForName:object:queue:usingBlock: for examples. Notice that NSURLConnection expects an NSOperationQueue, not a dispatch_queue. Higher-level APIs and all that.
Call back on whatever queue you're on and leave it up to the receiver to deal with it. This is how callbacks have traditionally worked.
Demand that there be a runloop on the calling thread, and schedule your callbacks on the calling runloop. This is how NSURLConnection historically worked before queues.
Always make your callbacks on one of the well-known queues (particularly the main queue) unless told otherwise. I don't know of anywhere that this is done in UIKit, but I've seen it commonly in app code, and is a very easy approach most of the time.
Create a queue manually and dispatch both your calling code and your dispatch_after code onto that. That way you can guarantee that both pieces of code are run from the same queue.
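The first and last approaches in the list above can be combined in a few lines. This is only a sketch: RunAfter is a hypothetical helper name, and the caller is assumed to already hold a reference to the queue it wants the delayed block to run on.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the caller names the queue explicitly instead of
// asking GCD for a "current" one.
static void RunAfter(NSTimeInterval seconds,
                     dispatch_queue_t queue,
                     dispatch_block_t block)
{
    dispatch_time_t when =
        dispatch_time(DISPATCH_TIME_NOW, (int64_t)(seconds * NSEC_PER_SEC));
    dispatch_after(when, queue, block);
}
```

If the calling code is itself dispatched onto a queue you created, passing that same queue to RunAfter guarantees the delayed block runs on the same queue as the code that scheduled it.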
Needing to do this usually means you are already working around something with a hack. You can hack around it with another hack:
void (^block)(void) = ^{
[self doSomething];
usleep(delay_in_us); // blocks the current thread for the whole delay
[self doSomethingOther];
};
Instead of usleep() you might consider spinning a run loop.
I would not recommend this "approach" though. The better way is to have a method which takes a queue and a block as parameters, where the block is then executed on the specified queue.
And, by the way, while a block executes there are ways to check whether it is running on a particular queue, or on any of its parent queues, provided you have a reference to that queue beforehand: use the functions dispatch_queue_set_specific and dispatch_get_specific.
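A rough sketch of the dispatch_queue_set_specific / dispatch_get_specific check, assuming ARC. MakeTaggedQueue, IsRunningOnQueue and kQueueKey are hypothetical names chosen for illustration.

```objc
#import <Foundation/Foundation.h>

// Only the address of this variable matters; it serves as a unique key.
static char kQueueKey;

// Creates a serial queue tagged with a pointer to itself.
static dispatch_queue_t MakeTaggedQueue(const char *label)
{
    dispatch_queue_t queue = dispatch_queue_create(label, DISPATCH_QUEUE_SERIAL);
    // NULL destructor: the stored pointer is not owned by the key.
    dispatch_queue_set_specific(queue, &kQueueKey, (__bridge void *)queue, NULL);
    return queue;
}

// YES if the calling block runs on `queue`, or on a queue targeting it
// that does not set the key itself.
static BOOL IsRunningOnQueue(dispatch_queue_t queue)
{
    return dispatch_get_specific(&kQueueKey) == (__bridge void *)queue;
}
```

This only answers "am I on this particular queue?" for a queue you already have; it is not a replacement for dispatch_get_current_queue().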

@synchronized block versus GCD dispatch_async()

Essentially, I have a set of data in an NSDictionary, but for convenience I'm setting up some NSArrays with the data sorted and filtered in a few different ways. The data will be coming in via different threads (blocks), and I want to make sure there is only one block at a time modifying my data store.
I went through the trouble of setting up a dispatch queue this afternoon, and then randomly stumbled onto a post about @synchronized that made it seem like pretty much exactly what I want to be doing.
So what I have right now is...
// a property on my object
@property (assign) dispatch_queue_t matchSortingQueue;
// in my object init
_matchSortingQueue = dispatch_queue_create("com.asdf.matchSortingQueue", NULL);
// then later...
- (void)sortArrayIntoLocalStore:(NSArray*)matches
{
dispatch_async(_matchSortingQueue, ^{
// do stuff...
});
}
And my question is, could I just replace all of this with the following?
- (void)sortArrayIntoLocalStore:(NSArray*)matches
{
@synchronized (self) {
// do stuff...
}
}
...And what's the difference between the two anyway? What should I be considering?
Although the functional difference might not matter much to you, it's what you'd expect: if you use @synchronized then the thread you're on is blocked until it can get exclusive execution. If you dispatch to a serial dispatch queue asynchronously then the calling thread can get on with other things and whatever it is you're actually doing will always occur on the same, known queue.
So they're equivalent for ensuring that a third resource is used from only one queue at a time.
Dispatching could be a better idea if, say, you had a resource that is accessed by the user interface from the main queue and you wanted to mutate it. Then your user interface code doesn't need to use @synchronized explicitly, and the complexity of your threading scheme stays hidden within the object quite naturally. Dispatching will also be a better idea if you've got a central actor that can trigger several of these changes on other different actors; that'll allow them to operate concurrently.
Synchronising is more compact and a lot easier to step debug. If what you're doing tends to be two or three lines and you'd need to dispatch it synchronously anyway then it feels like going to the effort of creating a queue isn't worth it — especially when you consider the implicit costs of creating a block and moving it over onto the heap.
In the second case you would block the calling thread until "do stuff" was done. Using queues and dispatch_async you will not block the calling thread. This would be particularly important if you call sortArrayIntoLocalStore from the UI thread.
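To make the queue-based variant concrete, here is a small sketch assuming ARC. MatchStore and its method names are hypothetical and only loosely modelled on the question's sortArrayIntoLocalStore:. Writes go through dispatch_async so callers never block; reads use dispatch_sync so they see every write queued before them.

```objc
#import <Foundation/Foundation.h>

@interface MatchStore : NSObject
- (void)addMatches:(NSArray *)matches;  // returns immediately
- (NSArray *)sortedMatches;             // waits for pending writes
@end

@implementation MatchStore {
    dispatch_queue_t _queue;
    NSMutableArray *_matches;
}

- (instancetype)init {
    if ((self = [super init])) {
        _queue = dispatch_queue_create("com.example.matchStore",
                                       DISPATCH_QUEUE_SERIAL);
        _matches = [NSMutableArray array];
    }
    return self;
}

- (void)addMatches:(NSArray *)matches {
    NSArray *copy = [matches copy];
    // Only blocks on _queue touch _matches, one at a time.
    dispatch_async(_queue, ^{
        [self->_matches addObjectsFromArray:copy];
        [self->_matches sortUsingSelector:@selector(compare:)];
    });
}

- (NSArray *)sortedMatches {
    __block NSArray *snapshot = nil;
    // Never call this from a block already running on _queue: it would deadlock.
    dispatch_sync(_queue, ^{ snapshot = [self->_matches copy]; });
    return snapshot;
}
@end
```

A @synchronized version would be shorter, but every caller of addMatches:, including the main thread, would wait for the sort to finish.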

Clarifications needed for concurrent operations, NSOperationQueue and async APIs

This is a two part question. Hope someone could reply with a complete answer.
NSOperations are powerful objects. They can be of two different types: non-concurrent or concurrent.
The first type runs synchronously. You can take advantage of non-concurrent operations by adding them to an NSOperationQueue. The latter creates one or more threads for you. The result is that the operation runs in a concurrent manner. The only caveat regards the lifecycle of such an operation. When its main method finishes, it is removed from the queue. This can be a problem when you deal with async APIs.
Now, what about concurrent operations? From Apple doc
If you want to implement a concurrent operation—that is, one that runs
asynchronously with respect to the calling thread—you must write
additional code to start the operation asynchronously. For example,
you might spawn a separate thread, call an asynchronous system
function, or do anything else to ensure that the start method starts
the task and returns immediately and, in all likelihood, before the
task is finished.
This is almost clear to me. They run asynchronously, but you must take the appropriate actions to ensure that they do.
What is not clear to me is the following. The doc says:
Note: In OS X v10.6, operation queues ignore the value returned by
isConcurrent and always call the start method of your operation from a
separate thread.
What does this really mean? What happens if I add a concurrent operation to an NSOperationQueue?
Then, in this post Concurrent Operations, concurrent operations are used to download some HTTP content by means of NSURLConnection (in its async form). Operations are concurrent and included in a specific queue.
UrlDownloaderOperation * operation = [UrlDownloaderOperation urlDownloaderWithUrlString:url];
[_queue addOperation:operation];
Since NSURLConnection requires a run loop, the author shunts the start method to the main thread (so I suppose that adding the operation to the queue has spawned a different thread). In this manner, the main run loop can invoke the delegate included in the operation.
- (void)start
{
if (![NSThread isMainThread])
{
[self performSelectorOnMainThread:@selector(start) withObject:nil waitUntilDone:NO];
return;
}
[self willChangeValueForKey:@"isExecuting"];
_isExecuting = YES;
[self didChangeValueForKey:@"isExecuting"];
NSURLRequest * request = [NSURLRequest requestWithURL:_url];
_connection = [[NSURLConnection alloc] initWithRequest:request
delegate:self];
if (_connection == nil)
[self finish];
}
- (BOOL)isConcurrent
{
return YES;
}
// delegate method here...
My question is the following. Is this thread safe? The run loop listens for sources but invoked methods are called in a background thread. Am I wrong?
Edit
I've completed some tests on my own based on the code provided by Dave Dribin (see 1). I've noticed, as you wrote, that callbacks of NSURLConnection are called in the main thread.
Ok, but now I'm still very confused. I'll try to explain my doubts.
Why wrap an async pattern, whose callbacks are called on the main thread, inside a concurrent operation? Shunting the start method to the main thread makes the callbacks execute on the main thread, but then what is the point of the queues and operations? Where do I take advantage of the threading mechanisms provided by GCD?
Hope this is clear.
This is kind of a long answer, but the short version is that what you're doing is totally fine and thread safe since you've forced the important part of the operation to run on the main thread.
Your first question was, "What happens if I add a concurrent operation in a NSOperationQueue?" As of iOS 4, NSOperationQueue uses GCD behind the scenes. When your operation reaches the top of the queue, it gets submitted to GCD, which manages a pool of private threads that grows and shrinks dynamically as needed. GCD assigns one of these threads to run the start method of your operation, and guarantees this thread will never be the main thread.
When the start method finishes in a concurrent operation, nothing special happens (which is the point). The queue will allow your operation to run forever until you set isFinished to YES and do the proper KVO willChange/didChange calls, regardless of the calling thread. Typically you'd make a method called finish to do that, which it looks like you have.
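As a sketch of the finish pattern described above (assuming ARC): AsyncOperation, beginWork and the ivar names are hypothetical, and the question's UrlDownloaderOperation may well be structured differently.

```objc
#import <Foundation/Foundation.h>

@interface AsyncOperation : NSOperation
- (void)beginWork;  // hypothetical hook: subclasses start their async work here
- (void)finish;     // call from the async completion callback
@end

@implementation AsyncOperation {
    BOOL _opExecuting;
    BOOL _opFinished;
}

- (BOOL)isConcurrent   { return YES; }
- (BOOL)isAsynchronous { return YES; }
- (BOOL)isExecuting { @synchronized (self) { return _opExecuting; } }
- (BOOL)isFinished  { @synchronized (self) { return _opFinished; } }

- (void)start {
    if (self.isCancelled) { [self finish]; return; }
    [self willChangeValueForKey:@"isExecuting"];
    @synchronized (self) { _opExecuting = YES; }
    [self didChangeValueForKey:@"isExecuting"];
    [self beginWork];  // returns immediately; the operation stays alive
}

- (void)beginWork {}

- (void)finish {
    // The queue observes these keys; without the KVO calls the
    // operation is never removed from the queue.
    [self willChangeValueForKey:@"isExecuting"];
    [self willChangeValueForKey:@"isFinished"];
    @synchronized (self) { _opExecuting = NO; _opFinished = YES; }
    [self didChangeValueForKey:@"isFinished"];
    [self didChangeValueForKey:@"isExecuting"];
}
@end
```

The flags are guarded with @synchronized because finish may be called from a different thread than start, which is exactly the situation this answer describes.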
All this is fine and well, but there are some caveats involved if you need to observe or manipulate the thread on which your operation is running. The important thing to remember is this: don't mess with threads managed by GCD. You can't guarantee they'll live past the current frame of execution, and you definitely can't guarantee that subsequent delegate calls (i.e., from NSURLConnection) will occur on the same thread. In fact, they probably won't.
In your code sample, you've shunted start off to the main thread so you don't need to worry much about background threads (GCD or otherwise). When you create an NSURLConnection it gets scheduled on the current run loop, and all of its delegate methods will get called on that run loop's thread, meaning that starting the connection on the main thread guarantees its delegate callbacks also happen on the main thread. In this sense it's "thread safe" because almost nothing is actually happening on a background thread besides the start of the operation itself, which may actually be an advantage because GCD can immediately reclaim the thread and use it for something else.
Let's imagine what would happen if you didn't force start to run on the main thread and just used the thread given to you by GCD. A run loop can potentially hang forever if its thread disappears, such as when it gets reclaimed by GCD into its private pool. There are some techniques floating around for keeping the thread alive (such as adding an empty NSPort), but they don't apply to threads created by GCD, only to threads you create yourself and can guarantee the lifetime of.
The danger here is that under light load you actually can get away with running a run loop on a GCD thread and think everything is fine. Once you start running many parallel operations, especially if you need to cancel them midflight, you'll start to see operations that never complete and never deallocate, leaking memory. If you wanted to be completely safe, you'd need to create your own dedicated NSThread and keep the run loop going forever.
In the real world, it's much easier to do what you're doing and just run the connection on the main thread. Managing the connection consumes very little CPU and in most cases won't interfere with your UI, so there's very little to gain by running the connection completely in the background. The main thread's run loop is always running and you don't need to mess with it.
It is possible, however, to run an NSURLConnection entirely in the background using the dedicated thread method described above. For an example, check out JXHTTP, in particular the classes JXOperation and JXURLConnectionOperation.

is there a way that the synchronized keyword doesn't block the main thread

Imagine you want to do many things in the background of an iOS application, and you code it properly so that you create threads (for example using GCD) to execute this background activity.
Now what if you need at some point to update a variable, but this update can occur on any of the threads you created?
You obviously want to protect that variable, and you can use the @synchronized keyword to create the locks for you, but here is the catch (extract from the Apple documentation):
The @synchronized() directive locks a section of code for use by a
single thread. Other threads are blocked until the thread exits the
protected code—that is, when execution continues past the last
statement in the @synchronized() block.
So that means if you synchronize on an object and two threads are writing to it at the same time, even the main thread will block until both threads are done writing their data.
An example of code that will showcase all this:
// Create the background queue
dispatch_queue_t queue = dispatch_queue_create("synchronized_example", NULL);
// Start working in new thread
dispatch_async(queue, ^
{
// Synchronized that shared resource
@synchronized(sharedResource_)
{
// Write things on that resource
// If more than one thread access this piece of code:
// all threads (even main thread) will block until task is completed.
[self writeComplexDataOnLocalFile];
}
});
// won’t actually go away until queue is empty
dispatch_release(queue);
So the question is fairly simple: how do we overcome this? How can we safely add a lock on all the threads EXCEPT the main thread which, we know, doesn't need to be blocked in that case?
EDIT FOR CLARIFICATION
As some of you commented, it does seem logical (and this was clearly what I thought at first when using synchronized) that only the two threads that are trying to acquire the lock should block until they are both done.
However, tested in a real situation, this doesn't seem to be the case and the main thread seems to also suffer from the lock.
I use this mechanism to log things in separate threads so that the UI is not blocked. But when I do intense logging, the UI (main thread) is clearly highly impacted (scrolling is not as smooth).
So two options here: either the background tasks are so heavy that even the main thread gets impacted (which I doubt), or synchronized also blocks the main thread while performing the lock operations (which I'm starting to reconsider).
I'll dig a little further using the Time Profiler.
I believe you are misunderstanding the following sentence that you quote from the Apple documentation:
Other threads are blocked until the thread exits the protected code...
This does not mean that all threads are blocked; it just means that all threads that are trying to synchronise on the same object (sharedResource_ in your example) are blocked.
The following quote is taken from Apple's Thread Programming Guide, which makes it clear that only threads that synchronise on the same object are blocked.
The object passed to the @synchronized directive is a unique identifier used to distinguish the protected block. If you execute the preceding method in two different threads, passing a different object for the anObj parameter on each thread, each would take its lock and continue processing without being blocked by the other. If you pass the same object in both cases, however, one of the threads would acquire the lock first and the other would block until the first thread completed the critical section.
Update: If your background threads are impacting the performance of your interface then you might want to consider putting some sleeps into the background threads. This should allow the main thread some time to update the UI.
I realise you are using GCD but, for example, NSThread has a couple of methods that will suspend the thread, e.g. -sleepForTimeInterval:. In GCD you can probably just call sleep().
Alternatively, you might also want to look at changing the thread priority to a lower priority. Again, NSThread has the setThreadPriority: for this purpose. In GCD, I believe you would just use a low priority queue for the dispatched blocks.
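For the GCD route, one possible sketch is a serial queue that targets the low-priority global queue. MakeLowPriorityLogQueue and the queue label are hypothetical names; whether this actually smooths out scrolling depends on what the logging work does.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: a serial queue whose blocks run at low priority.
static dispatch_queue_t MakeLowPriorityLogQueue(void)
{
    dispatch_queue_t queue =
        dispatch_queue_create("com.example.logging", DISPATCH_QUEUE_SERIAL);
    // Blocks submitted to `queue` are ultimately executed by the
    // low-priority global queue, but still one at a time.
    dispatch_set_target_queue(
        queue, dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_LOW, 0));
    return queue;
}
```

Because the queue is serial, writes dispatched to it are already mutually exclusive, so the lock around the file write may no longer be needed if every writer goes through this queue.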
I'm not sure if I understood you correctly. @synchronized doesn't block all threads, only the ones that want to execute the code inside the block. So the solution probably is: don't execute the code on the main thread.
If you simply want to avoid having the main thread acquire the lock, you can do this (and wreak havoc):
dispatch_async(queue, ^
{
if(![NSThread isMainThread])
{
// Synchronized that shared resource
@synchronized(sharedResource_)
{
// Write things on that resource
// If more than one thread access this piece of code:
// all threads (even main thread) will block until task is completed.
[self writeComplexDataOnLocalFile];
}
}
else
[self writeComplexDataOnLocalFile];
});
