I created an NSOperationQueue and set the maxConcurrentOperationCount property to 2.
If I create 2 operations that never stop, then as I continue to add operations, the NSOperationQueue will cache the pending tasks. What is the maximum number of operations the NSOperationQueue can cache, and will it cause a memory surge?
NSOperationQueue *queue = [[NSOperationQueue alloc] init];
queue.maxConcurrentOperationCount = 2;

[queue addOperationWithBlock:^{
    while (YES) {
        NSLog(@"thread1 : %@", [NSThread currentThread]);
    }
}];

[queue addOperationWithBlock:^{
    while (YES) {
        NSLog(@"thread2 : %@", [NSThread currentThread]);
    }
}];

// this operation will wait
[queue addOperationWithBlock:^{
    while (YES) {
        NSLog(@"thread3 : %@", [NSThread currentThread]);
    }
}];
Above is my code; the third operation will never run. From what I understand, the queue holds on to these pending tasks, so if you keep adding operations to it, memory will keep growing. Will NSOperationQueue handle this situation internally?
An operation queue will handle a large number of operations without incident. That is one of the reasons we use operation queues: to gracefully handle constrained concurrency for a number of operations that exceeds the maxConcurrentOperationCount.
Obviously, your particular example, with operations spinning indefinitely, is both inefficient (tying up two worker threads with a computationally intensive process) and will prevent the third operation from ever starting. But if you changed the operations to something more practical (e.g., ones that finish in some finite period of time), the operation queue can gracefully handle a very large number of operations.
That is not to say that operation queues can be used recklessly. For example, one can easily create operation queue scenarios that suffer from thread explosion and exhaust the limited worker thread pool. Or if you had operations that individually used tons of memory, then eventually, if you had enough of those queued up, you could theoretically introduce memory issues.
But don’t worry about theoretical problems. If you have a practical example for which you are having a problem, then post that as a separate question. But, in answer to your question here, operation queues can handle many queued operations quite well.
Let us consider queuing 100,000 operations, each taking one second to finish:
NSOperationQueue *queue = [[NSOperationQueue alloc] init];
queue.maxConcurrentOperationCount = 2;

for (NSInteger i = 0; i < 100000; i++) {
    [queue addOperationWithBlock:^{
        [NSThread sleepForTimeInterval:1];
        NSLog(@"%ld", (long)i);
    }];
}
NSLog(@"done queuing");
The operation queue handles all of these operations, only running two at a time, perfectly well. It will take some memory (e.g., roughly 80 MB) to hold these 100,000 operations at a given moment, but it handles them fine. Even at 1,000,000 operations it works (though it will take ~500 MB of memory). Clearly, at some point you will run out of memory, but if you're contemplating something with this many operations, you should be considering other patterns anyway.
There are obviously practical limitations.
Let us consider a degenerate example: Imagine you had a multi-gigabyte video file and you wanted to run some task on each frame of the video. It would be a poor design to add operations for each frame, up-front, passing it the contents of the relevant frame of the video (because you would effectively be trying to hold the entire video, frame-by-frame, in memory at one time).
But this is not an operation queue limitation. This is just a practical memory limitation. We would generally consider a different pattern. In my hypothetical example, I would consider dispatch_apply, known as concurrentPerform in Swift, that would simply load the relevant frame in a just-in-time manner.
Bottom line, just consider how much memory each operation added to the queue will consume, and use your own good judgment as to whether holding all of them in memory at the same time is acceptable. If you have many operations, each holding a large piece of data in memory, then consider other patterns.
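To make the just-in-time idea from the video example concrete, here is a minimal sketch using dispatch_apply. The loadFrame function and the frame count are hypothetical stand-ins for whatever your decoding step actually is; the point is only that each frame is loaded inside its own iteration, not held up-front:

```objc
#import <Foundation/Foundation.h>

// Hypothetical loader: returns one frame's data on demand, so only
// the frames currently being processed need to be in memory.
static NSData *loadFrame(size_t index) {
    return [NSData data]; // placeholder for real frame decoding
}

static void processAllFrames(size_t frameCount) {
    dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_apply(frameCount, queue, ^(size_t i) {
        @autoreleasepool {                 // release each frame promptly
            NSData *frame = loadFrame(i);  // loaded just-in-time, not up-front
            // ... analyze the frame here ...
            (void)frame;
        }
    });
}
```

Because dispatch_apply parallelizes across the available cores but does not enqueue everything at once, memory stays bounded by the number of frames in flight rather than by the size of the whole file.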
Related
I have to perform a complex operation on a large number of files. Fortunately, enumeration order is not important and the jobs can be done in parallel without locking.
Does the platform provide a way to do this? For lack of a better API, I was thinking of:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    NSArray *paths = [[NSFileManager defaultManager] subpathsAtPath:folder];
    [paths enumerateObjectsWithOptions:NSEnumerationConcurrent
                            usingBlock:^(NSString *path, NSUInteger idx, BOOL *stop) {
        // Complex operation
    }];
});
Is there a better way?
Your current code puts one block on the global queue. So, that single block will run on a background thread and do all of the iteration and processing.
You want to do something a bit different to have your processing tasks run concurrently. You should really do the iteration on the main thread and add a block to the global queue on each iteration of the loop.
Better yet, create an NSOperation subclass and put your logic there. Create an instance of the operation on each iteration of the loop and add it to an operation queue. This is a higher-level API and offers you options for adding dependencies, tailoring the maximum concurrency, checking the number of operations still to be completed, etc.
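A minimal sketch of that approach (the FileProcessingOperation name, the concurrency cap, and the per-file work are all placeholders for your own logic):

```objc
#import <Foundation/Foundation.h>

// Hypothetical NSOperation subclass: one instance per file.
@interface FileProcessingOperation : NSOperation
@property (nonatomic, copy) NSString *path;
@end

@implementation FileProcessingOperation
- (void)main {
    if (self.isCancelled) return;   // honor cancellation
    // ... perform the complex per-file work on self.path ...
}
@end

static NSOperationQueue *processPaths(NSArray *paths) {
    NSOperationQueue *queue = [[NSOperationQueue alloc] init];
    queue.maxConcurrentOperationCount = 4;   // cap the concurrency
    for (NSString *path in paths) {
        FileProcessingOperation *op = [[FileProcessingOperation alloc] init];
        op.path = path;
        [queue addOperation:op];
    }
    return queue;   // caller can waitUntilAllOperationsAreFinished if needed
}
```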
Here's an approach you can consider. If you have (or may have) tens of thousands of files, instead of enumerating with enumerateObjectsWithOptions:usingBlock: you may want to enumerate the array manually in batches (let's say 100 elements each). When the current batch completes execution (you can use dispatch groups to check that) you start the next batch. With this approach you can avoid adding tens of thousands of blocks to the queue.
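One way to sketch that batching (the batch size and names are arbitrary): run 100 paths at a time on the global queue, and use a dispatch group to wait for the current batch to drain before enqueuing the next:

```objc
#import <Foundation/Foundation.h>

// Process paths in fixed-size batches so the queue never holds
// more than one batch of pending blocks at a time.
static void processInBatches(NSArray *paths, void (^work)(NSString *path)) {
    static const NSUInteger kBatchSize = 100;
    dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    for (NSUInteger start = 0; start < paths.count; start += kBatchSize) {
        dispatch_group_t group = dispatch_group_create();
        NSUInteger end = MIN(start + kBatchSize, paths.count);
        for (NSUInteger i = start; i < end; i++) {
            NSString *path = paths[i];
            dispatch_group_async(group, queue, ^{ work(path); });
        }
        // Block until this batch completes before starting the next one.
        dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    }
}
```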
BTW I've deleted my previous answer, because it was wrong.
I'm looking for ways to speed up a lengthy calculation (with two nested for-loops), the results of which will be shown in a plot. I tried NSOperationQueue thinking that each of the inner for loops would run concurrently. But apparently that's not the case, at least in my implementation. If I remove the NSOperationQueue calls, I get my results in my plot, so I know the calculation is done properly.
Here's a code snippet:
NSInteger half_window, len;

len = [myArray length];
if (!len)
    return;

NSOperationQueue *queue = [[NSOperationQueue alloc] init];

half_window = 0.5 * (self.slidingWindowSize - 1);
numberOfPoints = len - 2 * half_window;

double __block minY = 0;
double __block maxY = 0;
double __block sum, y;

xPoints = (double *) malloc (numberOfPoints * sizeof(double));
yPoints = (double *) malloc (numberOfPoints * sizeof(double));

for ( NSUInteger i = half_window; i < (len - half_window); i++ )
{
    [queue addOperationWithBlock: ^{
        sum = 0.0;
        for ( NSUInteger j = -half_window; j <= half_window; j++ )
        {
            MyObject *mo = [myArray objectAtIndex: (i+j)];
            sum += mo.floatValue;
        }
        xPoints[i - half_window] = (double) i+1;
        y = (double) (sum / self.slidingWindowSize);
        yPoints[i - half_window] = y;
        if (y > maxY)
            maxY = y;
        if (y < minY)
            minY = y;
    }];
    [queue waitUntilAllOperationsAreFinished];
}

// update my core-plot
self.maximumValueForXAxis = len;
self.minimumValueForYAxis = floor(minY);
self.maximumValueForYAxis = ceil(maxY);
[self setUpPlotSpaceAndAxes];
[graph reloadData];

// cleanup
free(xPoints);
free(yPoints);
Is there a way to make this execute any faster?
You are waiting for all operations in the queue to finish after adding each item.
    [queue waitUntilAllOperationsAreFinished];
}

// update my core-plot
self.maximumValueForXAxis = len;
should be
}
[queue waitUntilAllOperationsAreFinished];

// update my core-plot
self.maximumValueForXAxis = len;
You are also setting the shared sum variable to 0.0 in each operation queue block.
This looks odd:
for ( NSUInteger j = -half_window; j <= half_window; j++ )
Assuming half_window is positive, you're assigning a negative number to an unsigned integer. I suspect this will wrap around to a huge unsigned value, which will fail the loop condition, meaning the loop body never executes.
However, this isn't the cause of your slowness.
Revised answer
Below, in my original answer, I address two types of performance improvement: (1) designing a responsive UI by moving complicated calculations to the background; and (2) making complicated calculations perform more quickly by making them multi-threaded (which is a little complicated, so be careful).
In retrospect, I now realize that you're computing a moving average, so the performance hit of the nested for loops can be eliminated entirely, cutting the Gordian knot. Using pseudo-code, you can do something like the following, which updates the running sum by removing the oldest point and adding the next point as you go along (where n is the number of points in your moving average; e.g., for a 30-point moving average over your large set, n is 30):
double sum = 0.0;

for (NSInteger i = 0; i < n; i++)
{
    sum += originalDataPoints[i];
}
movingAverageResult[n - 1] = sum / n;

for (NSInteger i = n; i < totalNumberOfPointsInOriginalDataSet; i++)
{
    sum = sum - originalDataPoints[i - n] + originalDataPoints[i];
    movingAverageResult[i] = sum / n;
}
That makes this a linear complexity problem, which should be much faster. You definitely do not need to break this into multiple operations that you add to some queue to try to make the algorithm run multi-threaded (e.g. which is great because you therefore bypass the complications I warn you of in my point #2 below). You can, though, wrap this whole algorithm as a single operation that you add to a dispatch/operation queue so it runs asynchronous of your user interface (my point #1 below) if you want.
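As a concrete, self-contained version of that pseudo-code (assuming plain C arrays of doubles, with the output holding count - n + 1 averages):

```objc
#import <Foundation/Foundation.h>

// O(count) moving average: maintain a running sum instead of
// re-summing the whole window for every output point.
static void movingAverage(const double *input, NSUInteger count,
                          NSUInteger n, double *output) {
    double sum = 0.0;
    for (NSUInteger i = 0; i < n; i++)
        sum += input[i];
    output[0] = sum / n;
    for (NSUInteger i = n; i < count; i++) {
        sum += input[i] - input[i - n];   // slide the window by one point
        output[i - n + 1] = sum / n;
    }
}
```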
Original answer
It's not entirely clear from your question what the performance issue is. There are two classes of performance issues:
User interface responsiveness: If you're concerned about the responsiveness of the UI, you should definitely eliminate the waitUntilAllOperationsAreFinished because that is, at the end of the day, making the calculation synchronous with respect to your UI. If you're trying to address responsiveness in the user interface, you might (a) remove the operation block inside the for loop; but then (b) wrap these two nested for loops inside a single block that you would add to your background queue. Looking at this at a high level, the code would end up looking like:
[queue addOperationWithBlock:^{
    // do all of your time consuming stuff here with
    // your nested for loops, no operations dispatched
    // inside the for loop

    // when all done
    [[NSOperationQueue mainQueue] addOperationWithBlock:^{
        // now update your UI
    }];
}];
Note, do not have any waitUntilAllOperationsAreFinished call here. The goal in responsive user interfaces is to have it run asynchronously, and using waitUntil... method effectively makes it synchronous, the enemy of a responsive UI.
Or, you can use the GCD equivalent:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    // do all of your time consuming stuff here

    // when all done
    dispatch_async(dispatch_get_main_queue(), ^{
        // now update your UI
    });
});
Again, we're calling dispatch_async (which is equivalent to making sure you don't call waitUntilAllOperationsAreFinished) to make sure we dispatch this code to the background, but then immediately return so our UI remains responsive.
When you do this, the method that does this will return almost instantaneously, keeping the UI from stuttering/freezing during this operation. And when this operation is done, it will update the UI accordingly.
Note, this presumes that you're doing this all in a single operation, not submitting a bunch of separate background operations. You're just going to submit this single operation to the background, it's going to do its complicated calculations, and when it's done, it will update your user interface. In the mean time your user interface can continue to be responsive (let the user do other stuff, or if that doesn't make sense, show the user some UIActivityIndicatorView, a spinner, so they know the app is doing something special for them and that it will be right back).
The take-home message, though, is that anything that freezes (even temporarily) a UI is not a great design. And be forewarned that if your existing process takes long enough, the watchdog process may even kill your app. Apple's counsel is that, at the very least, if it takes more than a few hundred milliseconds, you should be doing it asynchronously. And if the UI is trying to do anything else at the same time (e.g., some animation, a scrolling view, etc.), even a few hundred milliseconds is far too long.
Optimizing performance by making the calculation itself multi-threaded: If you're trying to tackle the more fundamental performance issue by making this multi-threaded, far more care must be taken regarding how you do this.
First, you probably want to restrict the number of concurrent operations you have to some reasonable number (you never want to risk using up all of the available threads). I'd suggest you set maxConcurrentOperationCount to some small, reasonable number (e.g. 4 or 6 or something like that). You'll have diminishing returns at that point anyway, because the device only has a limited number of cores available.
Second, and just as important, you should pay special attention to synchronizing your updates of the variables shared outside of the operations (like your minY, maxY, etc.). Let's say maxY is currently 100 and you have two concurrent operations, one trying to set it to 300 and another trying to set it to 200. If both check the current value, 100, see that theirs is greater, and proceed to write, then whenever the operation setting 300 happens to win the race, the other operation can overwrite it with 200, blowing away your 300 value.
When you want to write concurrent code with separate operations updating the same variables, you have to think very carefully about synchronizing these shared variables. See the Synchronization section of the Threading Programming Guide for a discussion of a variety of locking mechanisms that address this problem. Or you can define a dedicated serial queue for synchronizing the values, as discussed in Eliminating Lock-Based Code in the Concurrency Programming Guide.
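A sketch of that serial-queue approach, applied to the min/max updates above (queue label and function names are illustrative):

```objc
#import <Foundation/Foundation.h>

static double sharedMinY = 0, sharedMaxY = 0;

// A private serial queue acts as the "lock": every read and write of
// the shared min/max funnels through it, so updates cannot interleave.
static dispatch_queue_t syncQueue(void) {
    static dispatch_queue_t q;
    static dispatch_once_t once;
    dispatch_once(&once, ^{
        q = dispatch_queue_create("com.example.sync", DISPATCH_QUEUE_SERIAL);
    });
    return q;
}

static void recordValue(double y) {
    dispatch_async(syncQueue(), ^{
        if (y > sharedMaxY) sharedMaxY = y;   // safe: only this queue touches them
        if (y < sharedMinY) sharedMinY = y;
    });
}

// Reads go through the same queue, after pending writes drain.
static double currentMaxY(void) {
    __block double result;
    dispatch_sync(syncQueue(), ^{ result = sharedMaxY; });
    return result;
}
```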
Finally, when thinking about synchronization, you can always step back and ask whether the cost of synchronizing these variables is really necessary (there is a performance hit to synchronization even when there is no contention). For example, although it might seem counter-intuitive, it might be faster not to update minY and maxY during these operations at all, eliminating the need for synchronization. You could forgo calculating the range of y values while the operations run, wait until all of the operations are done, and then make one final pass through the entire result set to compute the min and max. This is an approach you can verify empirically: try it with locks (or another synchronization method), and again computing the range as a single pass at the very end, where no locks are necessary. Surprisingly, sometimes adding the extra loop at the end (and thereby eliminating the synchronization) is faster.
The bottom line is that you can't generally just take a serial piece of code and make it concurrent, without special attention to both of these considerations, constraining how many threads you'll consume and if you're going to be updating the same variables from multiple operations, consider how you're going to synchronize the values. And even if you decide to tackle this second issue, the multi-threaded calculation itself, you should still think about the first issue, the responsive UI, and perhaps marry both methods.
So I was wondering about the best way to break long-running tasks into NSOperations. If I have 3 long-running tasks, is it better to have one NSOperation subclass that basically does something like
Single NSOperation subclass
- (void)main {
    // do long running task 1
    // do long running task 2
    // do long running task 3

    // call back the delegate
}
Or is it better to have each task be a subclass of NSOperation, and then manage each task from my ViewController as a single unit of work? Thanks in advance.
It depends whether the operation queue is serial (i.e. max concurrent operations 1) or parallel, and what the nature of the work is. If the queue is serial, then it really doesn't matter. If the queue is parallel, then it depends on a bunch of factors:
is the work safe to do concurrently
does the work contend on a shared resource (such as network or disk IO, or a lock) that would remove the concurrency
is each unit of work sufficiently large to be worth the overhead of dispatching separately
(edit)
Also, if you don't need the advanced features of NSOperationQueue (operation dependencies and priorities, KVO, etc...), consider using dispatch queues instead. They're significantly lighter weight.
How to control and balance the number of threads my app is executing, how to limit their number to avoid app's blocking because thread limit is reached?
Here on SO I saw the following possible answer: "Main concurrent queue (dispatch_get_global_queue) manages the number of threads automatically" which I don't like for the following reason:
Consider the following pattern (in my real app there are both more simple and more complex examples):
dispatch_queue_t defaultBackgroundQueue() {
    return dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
}

dispatch_queue_t databaseQueue() {
    return dispatch_queue_create("Database private queue", 0);
}

dispatch_async(defaultBackgroundQueue(), ^{
    [AFNetworkingAsynchronousRequestWithCompletionHandler:^(data){
        dispatch_async(databaseQueue(), ^{
            // data is about 100-200 elements to parse
            for (el in data) {
            }
            // maybe more AFNetworking requests and/or
            // processing in other queues, or:
            dispatch_async(dispatch_get_main_queue(), ^{
                // At last! We can do something on UI.
            });
        });
    }];
});
This design very often leads to situations where:
The app locks up because the thread limit is reached (somewhere around 64).
The slower, and thus narrower, queues can be overwhelmed with a large number of pending jobs.
The second point also produces a cancellation problem: if we have 100 jobs already waiting for execution in a serial queue, we can't cancel them all at once.
The obvious and dumb solution would be to replace sensitive dispatch_async methods with dispatch_sync, but it is definitely the one I don't like.
What is the recommended approach for these kinds of situations?
I hope an answer smarter than just "Use NSOperationQueue - it can limit the number of concurrent operations" exists (similar topic: Number of threads with NSOperationQueueDefaultMaxConcurrentOperationCount).
UPDATE 1: The only decent pattern I see is to replace all dispatch_asyncs of blocks onto concurrent queues with NSOperations added to NSOperationQueue-based concurrent queues that have a max-operations limit set (in my case, perhaps also setting a max-operations limit on the NSOperationQueue that AFNetworking runs all its operations in).
You are starting too many network requests. AFAIK it's not documented anywhere, but you can run up to 6 simultaneous network connections (a sensible number considering RFC 2616, section 8.1.4, paragraph 6). After that you get blocking, and GCD compensates by creating more threads, which, by the way, have 512 KB of stack each, with pages allocated on demand. So yes, use NSOperation for this. I use it to queue network requests, increase the priority when the same object is requested again, and pause and serialize to disk if the user leaves. I also monitor the speed of the network requests in bytes/time and adjust the number of concurrent operations.
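A minimal sketch of queuing network requests this way (the synchronous fetch inside the block is a placeholder; a real implementation would use an asynchronous NSOperation subclass that marks itself finished in the request's completion handler):

```objc
#import <Foundation/Foundation.h>

// Throttle network work with an operation queue so only a few
// connections are in flight at once.
static NSOperationQueue *enqueueDownloads(NSArray *urls) {
    NSOperationQueue *networkQueue = [[NSOperationQueue alloc] init];
    networkQueue.maxConcurrentOperationCount = 4;  // stay under the connection limit
    for (NSURL *url in urls) {
        [networkQueue addOperationWithBlock:^{
            NSData *data = [NSData dataWithContentsOfURL:url]; // placeholder fetch
            (void)data;
            // ... hand the data off for parsing/storage ...
        }];
    }
    return networkQueue;
}
```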
While I don't see from your example where exactly you're creating "too many" background threads, I'll just try to answer the question of how to control the exact number of threads per queue. Apple's documentation says:
Concurrent queues (also known as a type of global dispatch queue) execute one or more tasks concurrently, but tasks are still started in the order in which they were added to the queue. The currently executing tasks run on distinct threads that are managed by the dispatch queue. The exact number of tasks executing at any given point is variable and depends on system conditions.
While you can now (since iOS5) create concurrent queues manually, there is no way to control how many jobs will be run concurrently by such a queue. The OS will balance the load automatically. If, for whatever reason, you don't want that, you could for example create a set of n serial queues manually and dispatch new jobs to one of your n queues at random:
NSArray *queues = @[dispatch_queue_create("com.myapp.queue1", 0),
                    dispatch_queue_create("com.myapp.queue2", 0),
                    dispatch_queue_create("com.myapp.queue3", 0)];

NSUInteger randQueue = arc4random() % [queues count];
dispatch_async([queues objectAtIndex:randQueue], ^{
    NSLog(@"Do something");
});

randQueue = arc4random() % [queues count];
dispatch_async([queues objectAtIndex:randQueue], ^{
    NSLog(@"Do something else");
});
I'm by no means endorsing this design - I think concurrent queues are pretty good at balancing system resources. But since you asked, I think this is a feasible approach.
I have a method that performs a mathematical operation repeatedly (possibly millions of times) with different data. What is the best way to do this on iOS (it will run on iPad devices)? I understand that performSelectorOnBackgroundThread is deprecated...? I also need to aggregate all the results in an NSArray. The best way seems to be to post a notification to the notification center and add the method as an observer. Is this correct? The array will need to be declared as atomic, I believe... Plus I will need to show a progress bar as the operations complete... How many threads can I start in parallel? I don't think starting 1,000,000 threads is such a good idea on an iDevice...
Thanks in advance...
Look into Grand Central Dispatch, it's the preferred way to do multi-threading on iOS (and Mac).
A simple example of using GCD would look like:
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

dispatch_async(queue, ^{
    // do long running task here
});
This will execute a block asynchronously, off the main thread. GCD has numerous other ways of dispatching tasks; one, taken from the Wikipedia article on Grand Central Dispatch, is:
dispatch_apply(count, dispatch_get_global_queue(0, 0), ^(size_t i) {
    results[i] = do_work(data, i);
});

total = summarize(results, count);
This particular code sample is probably exactly what you're looking for, assuming this "large task" of yours is embarrassingly parallel.
While you could use dispatch_apply() and spin off all of the runs simultaneously, that'll end up being slower. You'll want to throttle the number of runs in flight, with the number of simultaneous computations being something you'll need to tune.
I've often used a dispatch_semaphore_t to allow easy tuning of the number of in-flight computations.
Details of doing so are in an answer here: https://stackoverflow.com/a/4535110/25646
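The semaphore technique from that answer looks roughly like this (names and the dispatch-group bookkeeping are my own framing): initialize the semaphore to the number of in-flight computations you want, wait before dispatching each run, and signal when a run completes:

```objc
#import <Foundation/Foundation.h>

// Run totalRuns computations, allowing at most maxInFlight at once.
// The counting semaphore throttles submission; the group lets the
// caller block until everything has finished.
static void runThrottled(size_t totalRuns, long maxInFlight, void (^work)(size_t i)) {
    dispatch_semaphore_t sema = dispatch_semaphore_create(maxInFlight);
    dispatch_group_t group = dispatch_group_create();
    dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    for (size_t i = 0; i < totalRuns; i++) {
        // Blocks once maxInFlight computations are already running.
        dispatch_semaphore_wait(sema, DISPATCH_TIME_FOREVER);
        dispatch_group_async(group, queue, ^{
            work(i);
            dispatch_semaphore_signal(sema);  // free a slot for the next run
        });
    }
    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
}
```

Tuning maxInFlight (typically somewhere around the core count for CPU-bound work) is then a one-line change.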