I realized that I was queuing a lot of blocks that call empty methods. In the debugger it looks like a lot is happening, when really all the blocks are empty.
Is there any real performance impact from having empty blocks?
The overhead should be negligible. You can check this with Instruments and a simple program like:
#import <Foundation/Foundation.h>
int main(int argc, const char * argv[]) {
@autoreleasepool {
dispatch_queue_t q = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0);
void (^b)(void) = ^{ };
double d = 2.0;
for(int i = 0; i < 10000000; ++i) {
dispatch_sync(q, b);
d = d * 1.5 - 1.0;
}
NSLog(@"d = %.3f", d);
}
return 0;
}
As the Instruments stack trace shows, the calls take 40 ms for 10 million synchronous invocations of an empty block. That's not much overhead.
I am trying to see the race condition happen in the consumer-producer problem,
so I made multiple producers and multiple consumers.
From what I know, I need to provide a mutex together with the semaphores:
A mutex for the race conditions, because multiple producers can access the buffer at the same time, and then the data might be corrupted.
And semaphores to provide signaling between the producers and the consumers.
The problem is that the synchronization works correctly even though I am not using the mutex (I am using the semaphores only). Is my understanding correct, or is there anything wrong in the code below?
#include <pthread.h>
#include <stdio.h>
#include <semaphore.h>
#include <stdlib.h>
#include <unistd.h>
int buffer;
int loops = 0;
sem_t empty;
sem_t full;
sem_t mutex; //Adding MUTEX
void put(int value) {
buffer = value;
}
int get() {
int b = buffer;
return b;
}
void *producer(void *arg) {
int i;
for (i = 0; i < loops; i++) {
sem_wait(&empty);
//sem_wait(&mutex);
put(i);
//printf("Data Set from %s, Data=%d\n", (char*) arg, i);
//sem_post(&mutex);
sem_post(&full);
}
return NULL; /* a void * thread function must return a value */
}
void *consumer(void *arg) {
int i;
for (i = 0; i < loops; i++) {
sem_wait(&full);
//sem_wait(&mutex);
int b = get();
//printf("Data received from %s, %d\n", (char*) arg, b);
printf("%d\n", b);
//sem_post(&mutex);
sem_post(&empty);
}
return NULL; /* a void * thread function must return a value */
}
int main(int argc, char *argv[])
{
if(argc < 2 ){
printf("Needs 2nd arg for loop count variable.\n");
return 1;
}
loops = atoi(argv[1]);
sem_init(&empty, 0, 1);
sem_init(&full, 0, 0);
sem_init(&mutex, 0, 1);
pthread_t pThreads[3];
pthread_t cThreads[3];
pthread_create(&cThreads[0], 0, consumer, (void*)"Consumer1");
pthread_create(&cThreads[1], 0, consumer, (void*)"Consumer2");
pthread_create(&cThreads[2], 0, consumer, (void*)"Consumer3");
//Passing the name of the thread as parameter, ignore attr
pthread_create(&pThreads[0], 0, producer, (void*)"Producer1");
pthread_create(&pThreads[1], 0, producer, (void*)"Producer2");
pthread_create(&pThreads[2], 0, producer, (void*)"Producer3");
pthread_join(pThreads[0], NULL);
pthread_join(pThreads[1], NULL);
pthread_join(pThreads[2], NULL);
pthread_join(cThreads[0], NULL);
pthread_join(cThreads[1], NULL);
pthread_join(cThreads[2], NULL);
return 0;
}
I believe I have the problem figured out. Here's what is happening:
When initializing your semaphores, you set empty's initial count to 1 and full's to 0:
sem_init(&empty, 0, 1);
sem_init(&full, 0, 0);
sem_init(&mutex, 0, 1);
This means that there is only one "space" for a thread to get into the critical region. In other words, what your program is doing is:
produce (empty is now 0, full has 1)
consume (full is now 0, empty has 1)
produce (empty is now 0, full has 1)
...
It's as if you had a token (or, if you like, a mutex), and you pass that token between consumers and producers. That is actually what the consumer-producer problem is all about, only that in most cases we are worried about having several consumers and producers working at the same time (which means you have more than one token). Here, because you have only one token, you basically have what one mutex would do.
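To make the "more than one token" case concrete, here is a minimal sketch of the same program with a buffer of several slots, where the mutex does become necessary. BUF_SIZE, ITEMS_EACH, head, tail and run_demo are hypothetical names chosen for this illustration, and it assumes a Linux system where sem_init on unnamed semaphores is supported.

```c
#include <pthread.h>
#include <semaphore.h>

#define BUF_SIZE 4        /* hypothetical: more than one slot ("token") */
#define PRODUCERS 3
#define CONSUMERS 3
#define ITEMS_EACH 1000   /* items handled by each producer and each consumer */

static int buffer[BUF_SIZE];
static int head, tail;                 /* next slot to read / next slot to write */
static sem_t empty_slots, full_slots;  /* counting semaphores for signaling */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long consumed_sum;

static void *producer(void *arg) {
    (void)arg;
    for (int i = 1; i <= ITEMS_EACH; i++) {
        sem_wait(&empty_slots);        /* wait for a free slot */
        pthread_mutex_lock(&lock);     /* up to BUF_SIZE producers can get here at once */
        buffer[tail] = i;
        tail = (tail + 1) % BUF_SIZE;
        pthread_mutex_unlock(&lock);
        sem_post(&full_slots);         /* announce one filled slot */
    }
    return NULL;
}

static void *consumer(void *arg) {
    (void)arg;
    for (int i = 0; i < ITEMS_EACH; i++) {
        sem_wait(&full_slots);         /* wait for a filled slot */
        pthread_mutex_lock(&lock);     /* protects head and consumed_sum */
        consumed_sum += buffer[head];
        head = (head + 1) % BUF_SIZE;
        pthread_mutex_unlock(&lock);
        sem_post(&empty_slots);        /* hand the slot back */
    }
    return NULL;
}

/* Runs the demo once and returns the sum of everything consumed. */
long run_demo(void) {
    pthread_t p[PRODUCERS], c[CONSUMERS];
    head = tail = 0;
    consumed_sum = 0;
    sem_init(&empty_slots, 0, BUF_SIZE);
    sem_init(&full_slots, 0, 0);
    for (int i = 0; i < CONSUMERS; i++) pthread_create(&c[i], NULL, consumer, NULL);
    for (int i = 0; i < PRODUCERS; i++) pthread_create(&p[i], NULL, producer, NULL);
    for (int i = 0; i < PRODUCERS; i++) pthread_join(p[i], NULL);
    for (int i = 0; i < CONSUMERS; i++) pthread_join(c[i], NULL);
    sem_destroy(&empty_slots);
    sem_destroy(&full_slots);
    return consumed_sum;
}
```

With BUF_SIZE set to 1 the semaphores alone serialize access, as in the question. With more slots, several producers can pass sem_wait at the same time, so the updates to tail and head need the lock.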
Hope it helped :)
I have a problem with the pthread library in a C application for Linux.
In my application a thread is started over and over again.
But I always wait until the thread has finished before starting it again.
At some point the thread doesn't start anymore and I get an out-of-memory error.
The solution I found is to do a pthread_join after the thread has finished.
Can anyone tell me why the Thread doesn't end correctly?
Here is example code that causes the same problem.
If pthread_join isn't called, the process stops at about 380 calls of the thread:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <pthread.h>
#include <unistd.h>
#include <time.h> /* for time() */
volatile uint8_t check_p1 = 0;
uint32_t stack_start;
void *thread1(void *ch)
{
static int counter = 0;
int i;
int s[100000];
char stack_end;
srand(time(NULL) + counter);
for (i = 0; i < (sizeof (s)/sizeof(int)); i++) //do something
{
s[i] = rand();
}
counter++;
printf("Thread %i finished. Stacksize: %u\n", counter, ((uint32_t) (stack_start)-(uint32_t) (&stack_end)));
check_p1 = 1; // Mark Thread as finished
return 0;
}
int main(int argc, char *argv[])
{
pthread_t p1;
int counter = 0;
stack_start = (uint32_t)&counter; // save the Address of counter
while (1)
{
counter++;
check_p1 = 0;
printf("Start Thread %i\n", counter);
pthread_create(&p1, NULL, thread1, 0);
while (!check_p1) // wait until thread has finished
{
usleep(100);
}
usleep(1000); // wait a little bit to be really sure that the thread is finished
//pthread_join(p1,0); // crash without pthread_join
}
return 0;
}
The solution I found is to do a pthread_join after the thread has finished.
That is the correct solution. You must do that, or you leak thread resources.
Can anyone tell me why the Thread doesn't end correctly?
It does end correctly, but you must join it in order for the thread library to know: "yes, he is really done with this thread; no need to hold resources any longer".
This is exactly the same reason you must use wait (or waitpid, etc.) in this loop:
while (1) {
int status;
pid_t p = fork();
if (p == 0) exit(0); // child
// parent
wait(&status); // without this wait, you will run out of OS resources.
}
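The same idea for threads, as a minimal sketch: create and join in a loop, so each finished thread's resources are released before the next one starts. worker and run_joined are hypothetical names, and the increment only stands in for real work.

```c
#include <pthread.h>

static void *worker(void *arg) {
    int *n = arg;
    (*n)++;                 /* stand-in for real work */
    return NULL;
}

/* Starts `rounds` threads one after another and joins each of them.
 * Returns how many were created and joined successfully. */
int run_joined(int rounds) {
    int done = 0, ok = 0;
    for (int i = 0; i < rounds; i++) {
        pthread_t t;
        if (pthread_create(&t, NULL, worker, &done) != 0)
            break;          /* creation failed, e.g. out of resources */
        if (pthread_join(t, NULL) != 0)
            break;          /* join waits for the thread AND releases its resources */
        ok++;
    }
    return ok;
}
```

pthread_join also makes the polling flag and the usleep calls unnecessary, since it blocks until the thread has finished. If nothing ever needs to wait for the thread, pthread_detach is the usual alternative.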
I have the following code in a loop (it gets called every 1/4 second).
(I have removed the rest of the code to narrow the problem down to the following).
dispatch_async(self.audioQueue, ^{
AudioBufferList *aacBufferList;
aacBufferList = malloc(sizeof(AudioBufferList));
aacBufferList->mNumberBuffers = 1;
aacBufferList->mBuffers[0].mNumberChannels = aacStreamFormat.mChannelsPerFrame;
aacBufferList->mBuffers[0].mDataByteSize = maxOutputPacketSize;
aacBufferList->mBuffers[0].mData = (void *)(calloc(maxOutputPacketSize, 1));
// Other code was here. As stated above, I have removed it to isolate the problem to the allocating and freeing of memory for the AudioBufferList
freeABL(aacBufferList);
});
And the freeABL function:
void freeABL(AudioBufferList *abl)
{
for (int i = 0; i < abl->mNumberBuffers; i++)
{
free(abl->mBuffers[i].mData);
abl->mBuffers[i].mData = NULL;
}
free(abl);
abl = NULL;
}
The problem I have is that every time this loops, the memory consumption of my app increases, until I receive a memory warning.
Is this the correct way to sync threads without a mutex?
This code should be running for a long time.
#include <boost/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/memory_order.hpp>
#include <atomic>
#include <iostream>
std::atomic<long> x(0);
std::atomic<long> y(0);
boost::mutex m1;
// Thread increments
void Thread_Func()
{
for(;;)
{
// boost::mutex::scoped_lock lx(m1);
++x;
++y;
}
}
// Checker Thread
void Thread_Func_X()
{
for(;;)
{
// boost::mutex::scoped_lock lx(m1);
if(y > x)
{
// should never hit until long overflows
std::cout << y << "\\" << x << std::endl;
break;
}
}
}
//Test Application
int main(int argc, char* argv[])
{
boost::thread_group threads;
threads.create_thread(Thread_Func);
threads.create_thread(Thread_Func_X);
threads.join_all();
return 0;
}
Without knowing exactly what you're trying to do, it is hard to say whether it is the "correct" way. It's valid code, though a bit janky.
There is no guarantee that the "Checker" thread will ever see the condition y > x. It's theoretically possible that it will never break. In practice, it will trigger at some point but x might not be LONG_MIN and y LONG_MAX. In other words, it's not guaranteed to trigger just as the overflow happens.
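One reason the checker only ever sees a loose picture is that the comparison consists of two separate atomic loads, and the writer can run between them. The sketch below is written in C11 with stdatomic.h rather than the C++ of the question, and incrementer and count_violations are hypothetical names. It fixes the load order to y first, then x, and counts how often y > x is observed while the counters are far from overflow.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_long x, y;      /* static storage, so both start at zero */

static void *incrementer(void *arg) {
    long rounds = *(long *)arg;
    for (long i = 0; i < rounds; i++) {
        atomic_fetch_add(&x, 1);   /* x is always bumped first ... */
        atomic_fetch_add(&y, 1);   /* ... so at any instant x >= y */
    }
    return NULL;
}

/* Runs one incrementer for `rounds` iterations and returns the number of
 * times the checker observed y > x. */
long count_violations(long rounds) {
    pthread_t t;
    long violations = 0;
    atomic_store(&x, 0);
    atomic_store(&y, 0);
    pthread_create(&t, NULL, incrementer, &rounds);
    while (1) {
        long seen_y = atomic_load(&y);   /* read y first ... */
        long seen_x = atomic_load(&x);   /* ... then x, which can only have grown */
        if (seen_y > seen_x)
            violations++;
        if (seen_y == rounds)            /* incrementer has finished */
            break;
    }
    pthread_join(t, NULL);
    return violations;
}
```

With this load order and the default sequentially consistent operations, the value read for x is never older than the value read for y, so the count stays at zero until a counter wraps. Swapping the two loads would let y run ahead of the stale x, which is the kind of ordering detail the original `y > x` expression leaves open.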
I've seen it said multiple times that there is no way to limit a Lua script's memory usage, including people jumping through hoops to prevent Lua scripts from creating functions and tables. But given that lua_newstate allows you to pass a custom allocator, couldn't one just use that to limit memory consumption? At worst, one could use an arena-based allocator and put a hard limit even on the amount of memory that could be used by fragmentation.
Am I missing something here?
#include <stdlib.h> /* malloc, realloc, free */
static void *l_alloc_restricted (void *ud, void *ptr, size_t osize, size_t nsize)
{
const size_t MAX_SIZE = 1024; /* set limit here; it must be large enough for the Lua state itself */
size_t *used = (size_t *)ud; /* running total of bytes handed out */
if(ptr == NULL) {
/*
* <http://www.lua.org/manual/5.2/manual.html#lua_Alloc>:
* When ptr is NULL, osize encodes the kind of object that Lua is
* allocating.
*
* Since we don’t care about that, just mark it as 0.
*/
osize = 0;
}
if (nsize == 0)
{
free(ptr);
*used -= osize; /* subtract old size from used memory */
return NULL;
}
else
{
if (nsize > osize && *used + (nsize - osize) > MAX_SIZE) /* growing would exceed the limit */
return NULL;
ptr = realloc(ptr, nsize);
if (ptr) /* reallocation successful? */
*used = *used - osize + nsize;
return ptr;
}
}
To make Lua use your allocator, you can use:
size_t *ud = malloc(sizeof(size_t));
*ud = 0;
lua_State *L = lua_newstate(l_alloc_restricted, ud);
Note: I haven't tested the source, but it should work.
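Since the allocator is an ordinary C function, its accounting can be checked without Lua at all. The following is a minimal sketch under that assumption: mem_budget, budget_alloc and budget_selftest are hypothetical names, and the limit is carried in the userdata instead of being a constant inside the function.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userdata: the limit travels together with the counter. */
typedef struct {
    size_t used;
    size_t limit;
} mem_budget;

/* Same parameter list as a lua_Alloc function: (ud, ptr, osize, nsize). */
static void *budget_alloc(void *ud, void *ptr, size_t osize, size_t nsize) {
    mem_budget *b = ud;
    if (ptr == NULL)
        osize = 0;                       /* for new objects osize is a type tag, not a size */
    if (nsize == 0) {
        free(ptr);
        b->used -= osize;
        return NULL;
    }
    if (nsize > osize && nsize - osize > b->limit - b->used)
        return NULL;                     /* growing would exceed the budget */
    void *p = realloc(ptr, nsize);
    if (p != NULL)
        b->used = b->used - osize + nsize;
    return p;
}

/* Exercises the allocator directly, without Lua.
 * Returns 1 if the budget behaved as expected, 0 otherwise. */
int budget_selftest(void) {
    mem_budget b = { 0, 1024 };
    void *a = budget_alloc(&b, NULL, 0, 512);          /* fits: 512 <= 1024 */
    if (a == NULL || b.used != 512) return 0;
    void *too_big = budget_alloc(&b, NULL, 0, 1024);   /* 512 + 1024 > 1024 */
    if (too_big != NULL || b.used != 512) { free(too_big); return 0; }
    a = budget_alloc(&b, a, 512, 256);                 /* shrinking is always allowed */
    if (a == NULL || b.used != 256) return 0;
    budget_alloc(&b, a, 256, 0);                       /* nsize == 0 frees the block */
    return b.used == 0;
}
```

Passing budget_alloc and a pointer to a mem_budget to lua_newstate would hook it up in the same way as above. The accounting only covers the sizes Lua requests, not allocator overhead or fragmentation, which is where the arena idea from the question would come in.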