Workaround for the thread limit in Grand Central Dispatch? - ios

With Grand Central Dispatch, one can easily perform time-consuming tasks on a non-main thread, avoiding blocking the main thread and keeping the UI responsive, simply by using dispatch_async to run the task on a global concurrent queue:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    // code
});
However, something that sounds too good to be true usually has a downside. After using this heavily in our iOS app project, we recently found that there is a 64-thread limit. Once we hit the limit, the app freezes/hangs. Pausing the app in Xcode shows that the main thread is held by semaphore_wait_trap.
Googling confirms that others are encountering this problem too, but so far no solution has been found.
Dispatch Thread Hard Limit Reached: 64 (too many dispatch threads
blocked in synchronous operations)
Another Stack Overflow question confirms that this problem occurs when using dispatch_sync and dispatch_barrier_async too.
Question:
As Grand Central Dispatch has a 64-thread limit, is there any workaround for this?
Thanks in advance!

Well, if you're bound and determined, you can free yourself of the shackles of GCD, and go forth and slam right up against the OS per-process thread limit using pthreads, but the bottom line is this: if you're hitting the queue-width limit in GCD, you might want to consider reevaluating your concurrency approach.
At the extremes, there are two ways you can hit the limit:
You can have 64 threads blocked on some OS primitive via a blocking syscall. (I/O bound)
You can legitimately have 64 runnable tasks all ready to rock at the same time. (CPU bound)
If you're in situation #1, then the recommended approach is to use non-blocking I/O. In fact, GCD has a whole bunch of calls, introduced in 10.7/Lion IIRC, that facilitate asynchronous scheduling of I/O and improve thread re-use. If you use the GCD I/O mechanism, then those threads won't be tied up waiting on I/O, GCD will just queue up your blocks (or functions) when data becomes available on your file descriptor (or mach port). See the documentation for dispatch_io_create and friends.
In case it helps, here's a little example (presented without warranty) of a TCP echo server implemented using the GCD I/O mechanism:
#import <Foundation/Foundation.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <dispatch/dispatch.h>
#include <Block.h>

in_port_t port = 10000;

void DieWithError(char *errorMessage);

// Returns a block you can call later to shut down the server -- caller owns block.
dispatch_block_t CreateCleanupBlockForLaunchedServer()
{
    // Create the socket
    int servSock = -1;
    if ((servSock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0) {
        DieWithError("socket() failed");
    }

    // Bind the socket - if the port we want is in use, increment until we find one that isn't
    struct sockaddr_in echoServAddr;
    memset(&echoServAddr, 0, sizeof(echoServAddr));
    echoServAddr.sin_family = AF_INET;
    echoServAddr.sin_addr.s_addr = htonl(INADDR_ANY);
    do {
        printf("server attempting to bind to port %d\n", (int)port);
        echoServAddr.sin_port = htons(port);
    } while (bind(servSock, (struct sockaddr *) &echoServAddr, sizeof(echoServAddr)) < 0 && ++port);

    // Make the socket non-blocking
    if (fcntl(servSock, F_SETFL, O_NONBLOCK) < 0) {
        shutdown(servSock, SHUT_RDWR);
        close(servSock);
        DieWithError("fcntl() failed");
    }

    // Set up the dispatch source that will alert us to new incoming connections
    dispatch_queue_t q = dispatch_queue_create("server_queue", DISPATCH_QUEUE_CONCURRENT);
    dispatch_source_t acceptSource = dispatch_source_create(DISPATCH_SOURCE_TYPE_READ, servSock, 0, q);
    dispatch_source_set_event_handler(acceptSource, ^{
        const unsigned long numPendingConnections = dispatch_source_get_data(acceptSource);
        for (unsigned long i = 0; i < numPendingConnections; i++) {
            int clntSock = -1;
            struct sockaddr_in echoClntAddr;
            socklen_t clntLen = sizeof(echoClntAddr);

            // Wait for a client to connect
            if ((clntSock = accept(servSock, (struct sockaddr *) &echoClntAddr, &clntLen)) >= 0)
            {
                printf("server sock: %d accepted\n", clntSock);
                dispatch_io_t channel = dispatch_io_create(DISPATCH_IO_STREAM, clntSock, q, ^(int error) {
                    if (error) {
                        fprintf(stderr, "Error: %s", strerror(error));
                    }
                    printf("server sock: %d closing\n", clntSock);
                    close(clntSock);
                });

                // Configure the channel...
                dispatch_io_set_low_water(channel, 1);
                dispatch_io_set_high_water(channel, SIZE_MAX);

                // Set up the read handler
                dispatch_io_read(channel, 0, SIZE_MAX, q, ^(bool done, dispatch_data_t data, int error) {
                    BOOL close = NO;
                    if (error) {
                        fprintf(stderr, "Error: %s", strerror(error));
                        close = YES;
                    }
                    const size_t rxd = data ? dispatch_data_get_size(data) : 0;
                    if (rxd) {
                        printf("server sock: %d received: %ld bytes\n", clntSock, (long)rxd);
                        // write it back out; echo!
                        dispatch_io_write(channel, 0, data, q, ^(bool done, dispatch_data_t data, int error) {});
                    }
                    else {
                        close = YES;
                    }
                    if (close) {
                        dispatch_io_close(channel, DISPATCH_IO_STOP);
                        dispatch_release(channel);
                    }
                });
            }
            else {
                printf("accept() failed;\n");
            }
        }
    });

    // Resume the source so we're ready to accept once we listen()
    dispatch_resume(acceptSource);

    // listen() on the socket
    if (listen(servSock, SOMAXCONN) < 0) {
        shutdown(servSock, SHUT_RDWR);
        close(servSock);
        DieWithError("listen() failed");
    }

    // Make cleanup block for the server queue
    dispatch_block_t cleanupBlock = ^{
        dispatch_async(q, ^{
            shutdown(servSock, SHUT_RDWR);
            close(servSock);
            dispatch_release(acceptSource);
            dispatch_release(q);
        });
    };

    return Block_copy(cleanupBlock);
}
Anyway... back to the topic at hand:
If you're in situation #2, you should ask yourself, "Am I really gaining anything through this approach?" Let's say you have the most studly MacPro out there -- 12 cores, 24 hyperthreaded/virtual cores. With 64 threads, you've got an approx. 3:1 thread to virtual core ratio. Context switches and cache misses aren't free. Remember, we presumed that you weren't I/O bound for this scenario, so all you're doing by having more tasks than cores is wasting CPU time with context switches and cache thrash.
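If you really are CPU bound, something like dispatch_apply() lets GCD size the parallelism to the hardware for you, instead of queueing up scores of blocks that will each want their own thread. A minimal sketch (chunkCount and process_chunk are hypothetical names, not from the question):
dispatch_apply(chunkCount, dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^(size_t i) {
    process_chunk(i);   // CPU-bound work on slice i; no blocking syscalls inside
});
dispatch_apply() blocks until all iterations complete, so there is no pile-up of pending blocks.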
In reality, if your application is hanging because you've hit the queue-width limit, then the most likely scenario is that you've starved your queue. You've likely created a dependency that reduces to a deadlock. The case I've seen most often is when multiple, interlocked threads try to dispatch_sync on the same queue when there are no threads left. This always fails.
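To make the starvation scenario concrete, here is a minimal sketch (not from the question) that reliably wedges the worker pool: each block ties up one GCD worker thread in a synchronous wait, and the block that would release them all sits in the queue behind them, where it can never run.
dispatch_queue_t global = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_semaphore_t gate = dispatch_semaphore_create(0);
for (int i = 0; i < 100; i++) {
    dispatch_async(global, ^{
        // Each worker thread blocks here; with 64 workers all blocked,
        // a pause in Xcode shows them parked in semaphore_wait_trap.
        dispatch_semaphore_wait(gate, DISPATCH_TIME_FOREVER);
    });
}
// The block meant to open the gate is queued behind 100 blocked workers
// and never gets a thread to run on: deadlock.
dispatch_async(global, ^{
    for (int i = 0; i < 100; i++) {
        dispatch_semaphore_signal(gate);
    }
});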
Here's why: Queue width is an implementation detail. The 64-thread width limit of GCD is undocumented because a well-designed concurrency architecture shouldn't depend on the queue width. You should always design your concurrency architecture such that a 2-thread-wide queue would eventually finish the job with the same result (if more slowly) as a 1000-thread-wide queue. If you don't, there will always be a chance that your queue will get starved. Dividing your workload into parallelizable units should open up the possibility of optimization, not be a requirement for basic functioning. One way to enforce this discipline during development is to try working with a serial queue in places where you use concurrent queues, but expect non-interlocked behavior. Performing checks like this will help you catch some (but not all) of these bugs earlier.
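One way to apply that serial-queue discipline without rewriting call sites is to route all queue creation through a helper, so a single build flag flips the whole app to serial queues (a sketch; DEBUG_SERIAL is a hypothetical flag, not a GCD feature):
#ifdef DEBUG_SERIAL
#define WORK_QUEUE_ATTR DISPATCH_QUEUE_SERIAL      // debug builds: force serial
#else
#define WORK_QUEUE_ATTR DISPATCH_QUEUE_CONCURRENT  // release builds: concurrent
#endif

// All worker queues come from here, so one flag changes every queue in the app.
static dispatch_queue_t MakeWorkQueue(const char *label) {
    return dispatch_queue_create(label, WORK_QUEUE_ATTR);
}
If the app deadlocks or produces different results under DEBUG_SERIAL, the design depends on queue width and needs fixing.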
Also, to the precise point of your original question: IIUC, the 64-thread limit is 64 threads per top-level concurrent queue, so if you really feel the need, you can use all three top-level concurrent queues (default, high, and low priority) to achieve more than 64 threads total. Please don't do this, though. Fix your design so that it doesn't starve itself instead. You'll be happier. And anyway, as I hinted above, if you're starving out a 64-thread-wide queue, you'll probably eventually just fill all three top-level queues and/or run into the per-process thread limit and starve yourself that way too.

Related

pthread_cond_signal does not wake up the thread that waits

I am writing a program which creates a thread that prints 10 numbers. After it prints 5 of them, it notifies the main thread, waits, and then continues with the next 5 numbers.
This is test.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <pthread.h>
#include <unistd.h>

int rem = 10;
int count = 5;
pthread_mutex_t mtx;
pthread_cond_t cond1;
pthread_cond_t cond2;

void *f(void *arg)
{
    int a;
    srand(time(NULL));
    while (rem > 0) {
        a = rand() % 100;
        printf("%d\n", a);
        rem--;
        count--;
        if (count == 0) {
            printf("time to wake main thread up\n");
            pthread_cond_signal(&cond1);
            printf("second thread waits\n");
            pthread_cond_wait(&cond2, &mtx);
            printf("second thread woke up\n");
        }
    }
    pthread_exit(NULL);
}

int main()
{
    pthread_mutex_init(&mtx, 0);
    pthread_cond_init(&cond1, 0);
    pthread_cond_init(&cond2, 0);
    pthread_t tids;
    pthread_create(&tids, NULL, f, NULL);
    while (1) {
        if (count != 0) {
            printf("main: waiting\n");
            pthread_cond_wait(&cond1, &mtx);
            printf("5 numbers are printed\n");
            printf("main: waking up\n");
            pthread_cond_signal(&cond2);
            break;
        }
        pthread_cond_signal(&cond2);
        if (rem == 0) break;
    }
    pthread_join(tids, NULL);
}
The output of the program is:
main: waiting
//5 random numbers
time to wake main thread up
second thread waits
5 numbers are printed
main: waking up
Since I call pthread_cond_signal(&cond2); I thought that the thread would wake up and print the rest of the numbers, but this is not the case. Any ideas why? Thanks in advance.
Summary
The issues have been summarized in comments, or at least most of them. So as to put an actual answer on record, however:
Pretty much nothing about the program's use of shared variables and synchronization objects is correct. Its behavior is undefined, and the specific manifestation observed is just one of the more likely in a universe of possible behaviors.
Accessing shared variables
If two different threads access (read or write) the same non-atomic object during their runs, and at least one of the accesses is a write, then all accesses must be properly protected by synchronization actions.
There is a variety of these, too large to cover comprehensively in a StackOverflow answer, but among the most common is to use a mutex to guard access. In this approach, a mutex is created in the program and designated for protecting access to one or more shared variables. Each thread that wants to access one of those variables locks the mutex before doing so. At some later point, the thread unlocks the mutex, lest other threads be permanently blocked from locking the mutex themselves.
Example:
pthread_mutex_t mutex;  // must be initialized before use
int shared_variable;

// ...

void *thread_one_function(void *data) {
    int rval;

    // some work ...

    rval = pthread_mutex_lock(&mutex);
    // check for and handle lock failure ...

    shared_variable++;
    // ... maybe other work ...

    rval = pthread_mutex_unlock(&mutex);
    // check for and handle unlock failure ...

    // more work ...
}
In your program, the rem and count variables are both shared between threads, and access to them needs to be synchronized. You already have a mutex, and using it to protect accesses to these variables looks like it would be appropriate.
Using condition variables
Condition variables have that name because they are designed to support a specific thread interaction pattern: that one thread wants to proceed past a certain point only if a certain condition, which depends on actions performed by other threads, is satisfied. Such requirements arise fairly frequently. It is possible to implement this via a busy loop, in which the thread repeatedly tests the condition (with proper synchronization) until it is true, but this is wasteful. Condition variables allow such a thread to instead suspend operation until a time when it makes sense to check the condition again.
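For contrast, here is a minimal sketch of the busy loop just described, using the same mutex and shared_variable as the examples here. It produces a correct result but wastes CPU the entire time the condition is false:
pthread_mutex_lock(&mutex);
while (shared_variable < 5) {       /* condition not yet satisfied */
    pthread_mutex_unlock(&mutex);   /* give other threads a chance to update it */
    pthread_mutex_lock(&mutex);     /* ...then immediately re-lock and re-test */
}
/* condition satisfied; mutex still held */
pthread_mutex_unlock(&mutex);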
The correct usage pattern for a condition variable should be viewed as a modification and specialization of the busy loop:
(1) the thread locks a mutex guarding the data on which the condition is to be computed;
(2) the thread tests the condition;
(3) if the condition is satisfied then this procedure ends;
(4) otherwise, the thread waits on a designated condition variable, specifying the (same) mutex;
(5) when the thread resumes after its wait, it loops back to (2).
Example:
pthread_cond_t cv;  // must be initialized before use

void *thread_two_function(void *data) {
    int rval;

    // some work ...

    rval = pthread_mutex_lock(&mutex);
    // check for and handle lock failure ...

    while (shared_variable < 5) {
        rval = pthread_cond_wait(&cv, &mutex);
        // check for and handle wait failure ...
    }
    // ... maybe other work ...

    rval = pthread_mutex_unlock(&mutex);
    // check for and handle unlock failure ...

    // more work ...
}
Note that:
- the procedure terminates (at (3)) with the thread still holding the mutex locked. The thread has an obligation to unlock it, but sometimes it will want to perform other work under protection of that mutex first.
- the mutex is automatically released while the thread is waiting on the CV, and reacquired before the thread returns from the wait. This allows other threads the opportunity to access shared variables protected by the mutex.
- it is required that the thread calling pthread_cond_wait() have the specified mutex locked. Otherwise, the call provokes undefined behavior.
- this pattern relies on threads signaling or broadcasting to the CV at appropriate times to notify any then-waiting other threads that they might want to re-evaluate the condition for which they are waiting. That is not modeled in the examples above, but a sketch of the signaling side follows these notes.
- multiple CVs can use the same mutex.
- the same CV can be used in multiple places and with different associated conditions. It makes sense to do this when all the conditions involved are affected by the same or related actions by other threads.
- condition variables do not store signals. Only threads that are already blocked waiting on the specified CV are affected by a pthread_cond_signal() or pthread_cond_broadcast() call.
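Here is the signaling side just mentioned, sketched in the same style and with the same mutex, cv, and shared_variable as the examples above: change the shared data under the mutex, then signal so any waiter re-tests its condition.
void *thread_three_function(void *data) {
    int rval;

    rval = pthread_mutex_lock(&mutex);
    // check for and handle lock failure ...

    shared_variable = 5;              // make the awaited condition true
    rval = pthread_cond_signal(&cv);  // or pthread_cond_broadcast(&cv) for all waiters

    rval = pthread_mutex_unlock(&mutex);
    // check for and handle unlock failure ...

    return NULL;
}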
Your program
Your program has multiple problems in this area, among them:
Both threads access shared variables rem and count without synchronization, and some of the accesses are writes. The behavior of the whole program is therefore undefined. Among the common manifestations would be that the threads do not observe each other's updates to those variables, though it's also possible that things seem to work as expected. Or anything else.
Both threads call pthread_cond_wait() without holding the mutex locked. The behavior of the whole program is therefore undefined. "Undefined" means "undefined", but it is plausible that the UB would manifest as, for example, one or both threads failing to return from their wait after the CV is signaled.
Neither thread employs the standard pattern for CV usage. There is no clear associated condition for either one, and the threads definitely don't test one. That leaves an implied condition of "this CV has been signaled", but that is unsafe because it cannot be tested before waiting. In particular, it leaves open this possible chain of events:
(1) The main thread blocks waiting on cond1.
(2) The second thread signals cond1.
(3) The main thread runs all the way at least through signaling cond2 before the second thread proceeds to waiting on cond2.
Once (3) occurs, the program cannot avoid deadlock. The main thread breaks from the loop and tries to join the second thread; meanwhile, the second thread reaches its pthread_cond_wait() call and blocks awaiting a signal that will never arrive.
That chain of events can happen even if the issues called out in the previous points are corrected, and it could manifest exactly the observable behavior you report.
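To make the fixes concrete, here is one way (a sketch, not the only correct one) to rework the two threads so that every access to rem and count happens under mtx, and every wait is guarded by a testable condition rather than by "the CV has been signaled":
void *f(void *arg)
{
    srand(time(NULL));
    pthread_mutex_lock(&mtx);
    while (rem > 0) {
        printf("%d\n", rand() % 100);
        rem--;
        count--;
        if (count == 0) {
            pthread_cond_signal(&cond1);       /* condition: count == 0 */
            while (count == 0 && rem > 0)      /* wait for main to reset count */
                pthread_cond_wait(&cond2, &mtx);
        }
    }
    pthread_mutex_unlock(&mtx);
    return NULL;
}

int main(void)
{
    pthread_t tids;
    pthread_mutex_init(&mtx, NULL);
    pthread_cond_init(&cond1, NULL);
    pthread_cond_init(&cond2, NULL);
    pthread_create(&tids, NULL, f, NULL);

    pthread_mutex_lock(&mtx);
    while (count != 0)                         /* condition: first 5 printed */
        pthread_cond_wait(&cond1, &mtx);
    printf("5 numbers are printed\n");
    count = 5;                                 /* reset the batch counter */
    pthread_cond_signal(&cond2);               /* let f() continue */
    pthread_mutex_unlock(&mtx);

    pthread_join(tids, NULL);
    return 0;
}
Because both threads test real conditions (count == 0, count != 0) under the mutex, it no longer matters which thread reaches its wait first; a "lost" signal cannot cause a deadlock.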

Prioritization of one thread to process a variable when it is shared by 2 threads and protected by pthread_mutex_lock

Could anyone please suggest how to optimize the application code below using pthread_mutex_lock?
Let me describe the situation:
I have 2 threads sharing a global shared-memory variable. The variable shmPtr->status is protected with a mutex lock in both functions. Although there is a sleep(1/2) inside the for loop in the task1 function, I can't access shmPtr->status in task2 when required and have to wait until the for loop in task1 finishes. It takes around 50 seconds for shmPtr->status to become available to the task2 function.
I am wondering why the task1 function is not releasing the mutex lock despite the sleep(1/2) line. I don't want to wait with processing shmPtr->status in the task2 function. Please advise.
thr_id1 = pthread_create(&p_thread1, NULL, (void *)execution_task1, NULL);
thr_id2 = pthread_create(&p_thread2, NULL, (void *)execution_task2, NULL);

void execution_task1()
{
    for (int i = 0; i < 100; i++)
    {
        // 100 lines of application code running here
        pthread_mutex_lock(&lock);
        shmPtr->status = 1;  // shared memory variable
        pthread_mutex_unlock(&lock);
        sleep(1/2);
    }
}

void execution_task2()
{
    // 100 lines of application code running here
    pthread_mutex_lock(&lock);
    shmPtr->status = 0;  // shared memory variable
    pthread_mutex_unlock(&lock);
    sleep(1/2);
}
Regards,
NK
I am wondering why the task1 function is not releasing the mutex lock even with a sleep(1/2).
There is no reason to think that the thread running execution_task1() in your example fails to release the mutex, though you would know for sure if you appropriately tested the return value of its pthread_mutex_unlock() call. Rather, the potential problem is that it reacquires the mutex before any other thread contending for it has an opportunity to acquire it.
It seems plausible that a call to sleep(1/2) is ineffective at preventing that. 1/2 is an integer division, evaluating to 0, so you are performing a sleep(0). That probably does not cause the calling thread to suspend at all, and it may not even cause the thread to yield the CPU.
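If a half-second pause had actually been the intent, it would need to be written differently. A quick sketch of the distinction:
#include <time.h>
#include <unistd.h>

sleep(1/2);   /* integer division: 1/2 == 0, so this is sleep(0) */
sleep(1);     /* sleeps (at least) one whole second */

struct timespec half_second = { .tv_sec = 0, .tv_nsec = 500000000L };
nanosleep(&half_second, NULL);   /* sleeps (at least) 0.5 seconds */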
More generally, sleep() is never a good solution for any thread-synchronization problem.
If you are running on a multi-core system, however, and maybe even if you aren't, then freezing out other threads by such a mechanism seems unlikely if the function really does execute a hundred lines of code between releasing the mutex and trying to lock it again. If that's what you think you see then look harder.
If you really do need to force a thread to allow others a chance to acquire the mutex, then you could perhaps set up a fair queueing system as described in Implementing a FIFO mutex in pthreads. For a case such as you describe, however, with one long-running thread needing occasionally to yield to other, quicker tasks, you could consider introducing a condition variable on which that long-running thread can suspend, and an atomic counter by which it can determine whether it should do so:
#include <stdatomic.h>

// ...

pthread_cond_t yield_cv = PTHREAD_COND_INITIALIZER;
_Atomic unsigned int waiting_thread_count = ATOMIC_VAR_INIT(0);

void execution_task1() {
    for (int i = 0; i < 100; i++) {
        // ...
        pthread_mutex_lock(&lock);
        if (waiting_thread_count > 0) {
            pthread_cond_wait(&yield_cv, &lock);
            // spurious wakeup ok
        }
        // ... critical section ...
        pthread_mutex_unlock(&lock);
    }
}

void execution_task2() {
    // ...
    waiting_thread_count += 1; // atomic increment; safe w/o mutex
    pthread_mutex_lock(&lock);
    waiting_thread_count -= 1;
    pthread_cond_signal(&yield_cv); // no problem if no one is presently waiting
    // ... critical section ...
    pthread_mutex_unlock(&lock);
}
Using an atomic counter relieves the program from having to protect that counter with its own mutex, which could just shift the problem instead of solving it. It allows threads to use the counter to signal upcoming attempts to acquire the mutex. This intent is then visible to the other thread, so that it can suspend on the CV to allow the other to acquire the mutex.
The short-running threads then acknowledge acquiring the mutex by decrementing the counter. They must do so before releasing the mutex, else the long-running thread might cycle around, acquire the mutex, and read the counter before it is decremented, thus erroneously blocking on the CV when no additional signal can be expected.
Although CVs can be subject to spurious wakeup, that does not present a serious problem for this approach. If the long-running thread wakes spuriously from its wait on the CV, then the worst that happens is that it performs one more iteration of its main loop and then waits again.

iOS select versus kqueue/kevent versus mach_wait_until scheduling

I have a thread that listens on a single UDP socket, but also needs to wake up once in a while to perform other tasks. These tasks are triggered by the passage of time, or by activity on other threads. My current design is to use select() timeout value as scheduling timer, and to write a packet to the socket (loopback) address when I need to wake it from another thread.
However, Apple documentation says select() timeouts should not be used to wake up more than a few times per second. In practice, I find they may be delayed by 100 msec or more, whereas I would like 10-20 msec resolution. Are they just trying to discourage CPU-intensive polling, or is there something wrong with using select() per se? Is there a better approach?
Would it help to replace select with kqueue/kevent? Or, create a dedicated scheduling thread, with mach_wait_until() to handle the timer, and then write to the socket to wake the net thread? Or, do all the work in the dedicated thread, and have the net thread queue incoming data to it?
Something bugs me about this approach. Why do you have anything at all happening on the select() thread?
If you need a thread dedicated to waiting for incoming packets, then make that thread wait as much as possible:
while (1) {
    int numsockets = select(…);
    if (numsockets > 0) {
        // Read data (only drain the socket)
        dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
            // Process data
        });
    }
}
Then you can have your periodic tasks run using timer dispatch sources.
dispatch_source_t source = dispatch_source_create(DISPATCH_SOURCE_TYPE_TIMER, 0, 0,
                                                  dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0));
dispatch_source_set_event_handler(source, ^{
    // Periodic process data
});
uint64_t nsec = 0.001 * NSEC_PER_SEC;
dispatch_source_set_timer(source, dispatch_time(DISPATCH_TIME_NOW, nsec), nsec, 0);
dispatch_resume(source);
I don't know, but I've been told that you can even use a dispatch source to replace the select().
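For the reading side, that would look something like the following sketch (udpSock stands for your already-created, non-blocking UDP socket descriptor; it is not from the question). A read source fires whenever data arrives, so no thread ever sits blocked in select():
dispatch_queue_t q = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_source_t readSource = dispatch_source_create(DISPATCH_SOURCE_TYPE_READ, udpSock, 0, q);
dispatch_source_set_event_handler(readSource, ^{
    char buf[65536];
    ssize_t n = recv(udpSock, buf, sizeof(buf), 0);
    if (n > 0) {
        // Process the datagram
    }
});
dispatch_resume(readSource);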

EFAULT or EDESTADDRREQ for multiple dispatch_io_writes in GCD queue

I'm trying to implement a file storage queue subsystem, as described in the WWDC 2012 session Asynchronous Design Patterns with Blocks, GCD, and XPC. I have a custom concurrent processing queue that formats the data and hands over the result to my custom concurrent storage queue. The storage queue then creates a dispatch_io_t channel for writing, splits the file (if I don't split the file, I get memory issues with large data), and writes each chunk with dispatch_io_write. Sometimes the write completes without errors, but often I get either EFAULT or EDESTADDRREQ for one of the chunks.
My general approach is outlined as follows:
dispatch_async(processingQueue, ^{
    // prepare dispatch data for storageQueue
    dispatch_async(storageQueue, ^{
        dispatch_io_t writeChannel = dispatch_io_create_with_path(DISPATCH_IO_STREAM,
                                                                  [pathString UTF8String],
                                                                  O_RDWR|O_CREAT, // read-write, create if not exist
                                                                  S_IRWXU|S_IRWXG|S_IRWXO, // set all permissions
                                                                  storageQueue,
                                                                  ^(int error) {
                                                                  });
        __block size_t chunkSize = STORAGE_WRITE_CHUNK_SIZE;
        __block off_t currentOffset = 0;
        dispatch_io_set_high_water(writeChannel, chunkSize);
        for (currentOffset = 0; currentOffset < imageDataSize; currentOffset += chunkSize) {
            // is dispatch_barrier required?
            dispatch_data_t blockData = dispatch_data_create_subrange(dictData,
                                                                      currentOffset,
                                                                      MIN(imageDataSize - currentOffset, chunkSize));
            if (dispatch_data_get_size(blockData) > 0) {
                dispatch_io_write(writeChannel,
                                  currentOffset,
                                  blockData,
                                  storageQueue,
                                  ^(bool done, dispatch_data_t data, int error){
                                  });
            } else {
                NSLog(@"Error chunking data for writing!!");
                break;
            }
        }
        dispatch_io_close(writeChannel, 0);
    });
});
I have tried using serial queues, wrapping in dispatch_barrier blocks, and modifying the dispatch_io channel flags, but the problem persists: frequent errors in dispatch_io_write due to a bad address or destination.
I would like answers to:
1. What is the recommended way of doing buffered file writing using dispatch_io_write and a custom concurrent queue?
2. Do I need to use a barrier, and if so, dispatch_io_barrier on the channel or dispatch_barrier_async on the queue?
3. Are there any channel flags that I should be aware of in such a scenario?
I don't see anything obviously wrong in your snippet. There should be no need to chunk the data manually and submit multiple write operations (dispatch IO already does this internally more efficiently), and I have never heard of those specific IO errors being encountered before.
Please file a bug report with a runnable testcase that reproduces the issue.
When you create a stream-based channel using the DISPATCH_IO_STREAM flag, the channel does not support random access: any offset value (currentOffset) is ignored and the data is written at the current file position (0 in your snippet), as described in the documentation.
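If the per-chunk offsets are meant to be honored, a random-access channel supports that. A sketch reusing the names from the snippet above:
dispatch_io_t writeChannel = dispatch_io_create_with_path(DISPATCH_IO_RANDOM,
                                                          [pathString UTF8String],
                                                          O_RDWR|O_CREAT,
                                                          S_IRWXU|S_IRWXG|S_IRWXO,
                                                          storageQueue,
                                                          ^(int error) { /* cleanup */ });
// With DISPATCH_IO_RANDOM, the currentOffset argument to dispatch_io_write() is honored.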

POSIX threading on iOS

I've started experimenting with POSIX threads on the iOS platform. Coming from using NSThread, it's pretty daunting.
Basically, in my sample app I have a big array filled with type mystruct. Every so often (very frequently) I want to perform a task with the contents of one of these structs in the background, so I pass it to detachnewthread to kick things off.
I think I have the basics down, but I'd like to get a professional opinion before I attempt to move on to more complicated stuff.
Does what I have here seem OK, and could you point out anything missing that could cause problems? Can you spot any memory management issues, etc.?
typedef struct mystruct
{
    pthread_t thread;
    int a;
    long c;
} mystruct;

void *DoStuffWithMyStruct(void *threadid);

void detachnewthread(mystruct *str)
{
    // pthread_t thread;
    if (str)
    {
        int rc;
        // printf("In detachnewthread: creating thread %d\n", str->soundid);
        rc = pthread_create(&str->thread, NULL, DoStuffWithMyStruct, (void *)str);
        if (rc) {
            printf("ERROR; return code from pthread_create() is %d\n", rc);
            //exit(-1);
        }
    }
    //
    /* Last thing that main() should do */
    // pthread_exit(NULL);
}

void *DoStuffWithMyStruct(void *threadid)
{
    mystruct *sptr;
    sptr = (mystruct *)threadid;
    // do stuff with data in my struct
    pthread_detach(sptr->thread);
    return NULL;
}
One potential issue would be how the storage for the passed in structure mystruct is created. The lifetime of that variable is very critical to its usage in the thread. For example, if the caller of detachnewthread had that declared on the stack and then returned before the thread finished, it would be undefined behavior. Likewise, if it were dynamically allocated, then it is necessary to make sure it is not freed before the thread is finished.
In response to the comment/question: The necessity of some kind of mutex depends on the usage. For the sake of discussion, I will assume it is dynamically allocated. If the calling thread fills in the contents of the structure prior to creating the "child" thread and can guarantee that it will not be freed until after the child thread exits, and the subsequent access is read/only, then you would not need a mutex to protect it. I can imagine that type of scenario if the structure contains information that the child thread needs for completing its task.
If, however, more than one thread will be accessing the contents of the structure and one or more threads will be changing the data (writing to the structure), then you probably do need a mutex to protect it.
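For that read/write case, one common arrangement (a sketch with hypothetical helper names) is to embed the mutex in the struct it protects and route every access through it:
struct mystruct {
    pthread_mutex_t lock;   /* initialize once with pthread_mutex_init() */
    pthread_t thread;
    int a;
    long c;
};

void set_a(struct mystruct *s, int newval)
{
    pthread_mutex_lock(&s->lock);
    s->a = newval;              /* every read/write of a goes through the lock */
    pthread_mutex_unlock(&s->lock);
}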
Try using Apple's Grand Central Dispatch (GCD), which will manage the threads for you. GCD provides the capability to dispatch work, via blocks, to various queues that are managed by the system. Some of the queue types are concurrent, serial, and of course the main queue, where the UI runs. Based upon the CPU resources at hand, the system will manage the queues and the threads necessary to get the work done. A simple example, which shows how you can nest calls to different queues, is like this:
__block MyClass *blockSelf = self;
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    [blockSelf doSomeWork];
    dispatch_async(dispatch_get_main_queue(), ^{
        [blockSelf.textField setStringValue:@"Some work is done, updating UI"];
    });
});
__block MyClass *blockSelf = self is used simply to avoid the retain cycles associated with how blocks work.
Apple's docs:
http://developer.apple.com/library/ios/#documentation/Performance/Reference/GCD_libdispatch_Ref/Reference/reference.html
Mike Ash's Q&A blog post:
http://mikeash.com/pyblog/friday-qa-2009-08-28-intro-to-grand-central-dispatch-part-i-basics-and-dispatch-queues.html
