I have an event coming from a C++ library that runs on a background thread:
virtual void OnData(const char* data)
{
    NSLog(@"Here 'data' string is present %s", data);
    @autoreleasepool {
        NSString* sData = [NSString stringWithCString:data encoding:NSUTF8StringEncoding];
        dispatch_async(dispatch_get_main_queue(), ^{
            NSLog(@"Here _sometimes_ 'data' (%s) is nil (\0). But sData is always present %@", data, sData);
            [callback OnData:sData];
        });
    }
}
Sometimes, inside the dispatch_async block, the data argument is NULL (I suspect it's actually garbage), but the local NSString variable is always valid. Why?
P.S. Do I actually have to use @autoreleasepool in this situation?
You have no assurances about the lifespan of the buffer that const char *data was pointing to by the time the async block is performed. The data could be a dangling pointer by that point (and should be assumed to be). It's very dangerous to use C-style pointers in any asynchronous references or outside the context in which they were originally created.
You should either use memory managed objects (e.g. NSData, NSString, etc.) or, if you insist on using C-style pointers and need to reference this pointer in the asynchronous block, copy the data to your own buffer, use that buffer, and then free it when you're done using that buffer in your asynchronous routine. In this case, you have your sData, so just don't refer to data after that point, and you'll be fine.
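A minimal sketch of that copy-then-free approach, assuming the same callback as in the question (strdup gives the block its own copy with its own lifetime):

virtual void OnData(const char* data)
{
    char* copy = strdup(data);  // private copy; independent of the caller's buffer
    dispatch_async(dispatch_get_main_queue(), ^{
        NSString* sData = [NSString stringWithCString:copy encoding:NSUTF8StringEncoding];
        [callback OnData:sData];
        free(copy);             // release the copy once the async work is done
    });
}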
P.S. You later ask whether you must use @autoreleasepool in this situation.
In short, in most cases, no additional autorelease pool is needed. Notably, Grand Central Dispatch (e.g. dispatch_async) manages its own autorelease pools, so you don't have to create one. And when your main thread yields back to its run loop, its pool is drained. You only need manually created autorelease pools when instantiating your own NSThread objects.
Having said that, you will sometimes introduce autorelease pools when performing memory-intensive operations before yielding back to the run loop. In that case, you add autorelease pools to reduce the app's peak memory usage. But this would not appear to be one of those cases.
If you had something like this:
void CallOnData()
{
    char *test = malloc(5 * sizeof(char));
    strcpy(test, "test");
    OnData(test);
    free(test);
}
You should expect data to be dangling in the block: the memory has been freed, so its contents are undefined (sometimes it will read as NULL, sometimes as garbage).
And @autoreleasepool is not needed, assuming you're using ARC, which you should be.
Related
I'm working on an NSData extension that encrypts data with a key, as shown below. I'm not too savvy with Objective-C, but I would like to use it for this Cordova plugin instead of requiring another plugin to bridge Swift files.
I'm curious if I need to do any work to ensure that all clean-up in my methods is happening, so that each time this method is called, there are no leaks.
When extending an NS object, does one need to wrap their methods in @autoreleasepool {}?
This is the method that encrypts data (NSData+AES256Encrypt.m):
- (NSData *)AES256EncryptWithKey:(NSString *)key {
    // 'key' should be 32 bytes for AES256, will be null-padded otherwise
    char keyPtr[kCCKeySizeAES256 + 1]; // room for terminator (unused)
    bzero(keyPtr, sizeof(keyPtr));     // fill with zeros for padding

    // Get key data
    [key getCString:keyPtr maxLength:sizeof(keyPtr) encoding:NSUTF8StringEncoding];

    NSUInteger dataLength = [self length];
    size_t bufferSize = dataLength + kCCBlockSizeAES128;
    void *buffer = malloc(bufferSize);

    size_t numBytesEncrypted = 0;
    CCCryptorStatus cryptStatus = CCCrypt(kCCEncrypt, kCCAlgorithmAES128,
                                          kCCOptionPKCS7Padding, keyPtr,
                                          kCCKeySizeAES256, NULL, [self bytes],
                                          dataLength, buffer, bufferSize, &numBytesEncrypted);
    if (cryptStatus == kCCSuccess) {
        return [NSData dataWithBytesNoCopy:buffer length:numBytesEncrypted];
    }

    free(buffer);
    return nil;
}
It is used in conjunction with NSString+AES256Encrypt.m:
- (NSString *)AES256EncryptWithKey:(NSString *)key {
    NSData *plainData = [self dataUsingEncoding:NSUTF8StringEncoding];
    NSData *encryptedData = [plainData AES256EncryptWithKey:key];
    NSString *encryptedString = [encryptedData base64Encoding];
    return encryptedString;
}
The line that concerns me is the free(buffer) used in the first method posted above, called when cryptStatus is not kCCSuccess (meaning encryption failed). I also notice that dataWithBytesNoCopy: has a variant with a freeWhenDone parameter, which is documented as:
If YES, the returned object takes ownership of the bytes pointer and frees it on deallocation.
But I'm not sure if it applies to my situation.
Thanks for any and all help.
I don't see any issues.
Autoreleasing is only for Objective-C objects. It basically delays the actual -release call a little bit by putting the object in an autorelease pool, which gets flushed after every iteration of the main run loop, calling -release on every object in the pool. Objects returned from methods are typically autoreleased, though ARC has a mechanism which can often avoid the overhead of the actual pool by figuring out that the value is only needed by the caller, and can keep track of the reference and just call -release there. In ARC mode, the compiler figures out for you when autorelease vs release is needed, and does not let you call those methods yourself.
Most of the time, you don't need your own autorelease pools, though if you are doing something in a loop where every iteration can create lots of temporary objects, you may want an @autoreleasepool for every loop iteration so that memory does not build up, but rather gets cleaned up each time so the next iteration can re-use that memory. If you are writing a command-line program or some other tool which does not have its own Objective-C support, then yes, you need an autorelease pool around the entry point at least.
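For instance, the conventional entry point of a command-line tool wraps its work in exactly such a pool (a minimal sketch):

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // ...the tool's work; temporaries created here are
        // released when the pool block ends
    }
    return 0;
}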
C heap memory allocated with malloc/free is outside the autorelease concept (which only pertains to the retain/release mechanism of NSObject). For every malloc(), you need to eventually call free() on that memory once it is no longer needed, otherwise it's a leak. The code above is correct: there is a malloc(), and then either free() is called when returning nil, or dataWithBytesNoCopy: is called (a special method that uses the passed-in bytes as the actual NSData storage, avoiding the overhead of a memory copy and a further internal malloc, and that will then call free() when the object itself is deallocated).
dataWithBytesNoCopy:length: just calls dataWithBytesNoCopy:length:freeWhenDone: with YES for the freeWhenDone parameter, per its documentation. You can call the longer method explicitly if you think it makes the code more readable (it does more clearly indicate that you were aware of the free behavior), but it will work the same either way.
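If you prefer the explicit variant, the success path of the method above could be written like this (a sketch; the behavior is identical):

if (cryptStatus == kCCSuccess) {
    // Explicitly states that the NSData takes ownership of 'buffer'
    // and will free() it when the object is deallocated.
    return [NSData dataWithBytesNoCopy:buffer
                                length:numBytesEncrypted
                          freeWhenDone:YES];
}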
The NSData encrypt method really doesn't create any objects other than the one you are returning -- all of its code is more straight C code. So an autorelease pool would not help anything. The NSString encrypt method does create a few temporary objects, so if the amount of memory being encrypted is substantial, an autorelease pool around that may help if you have subsequent significant work (but be sure to have a strong reference to the object being returned outside the pool scope). And really, ARC will most likely figure out the temporary nature of most of those objects, and more efficiently deal with them than an autorelease pool anyways.
If you can profile your code using Instruments, you can see the memory usage of your program as it runs, and it would only be places with significant spikes that you would consider using a local autorelease pool.
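If profiling did reveal a spike, say when encrypting many strings in a row, a per-iteration pool is the usual fix. A minimal sketch, where messages and key are assumed inputs:

NSMutableArray *results = [NSMutableArray array];
for (NSString *message in messages) {
    @autoreleasepool {
        // temporaries from the encryption are released every iteration
        NSString *encrypted = [message AES256EncryptWithKey:key];
        if (encrypted) {
            [results addObject:encrypted];
        }
    }
}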
I'm not well versed in Objective-C, so forgive me if this is a stupid question. I've created a background thread to parse a list of files that exist in a directory, where the files in the directory can change at any time.
I call contentsOfDirectoryAtPath: in every iteration of the loop, and my allocations suddenly go over 300 MB. I can't figure out how to get ARC to release the returned array. Can someone maybe point me in the right direction here?
- (void)offlineModeThread
{
    NSString *dataDir = [ViewController getDataDirectory];
    NSFileManager *nfm = [NSFileManager defaultManager];

    while (1)
    {
        NSArray *files = [nfm contentsOfDirectoryAtPath:dataDir error:nil];
        /*
        if (files == nil)
            break;

        if ([files count] <= 0)
        {
            files = nil;
            [NSThread sleepForTimeInterval:5.0f];
            continue;
        }

        if (![ViewController obtainLock])
        {
            files = nil;
            continue;
        }
        */
        //[ViewController releaseLock];

        files = nil;
    }
}
As you can see, I've tried releasing the array by setting files to nil, but it doesn't work.
How are you creating this thread? NSThread? NSOperation and NSOperationQueue? GCD?
For NSThread and NSOperation, you're supposed to create your own autorelease pool as part of setting up the thread. That way temporary objects get managed correctly within that thread.
I believe GCD takes care of this detail for you.
In newer versions of Objective-C (Objective-C 2.0, I believe) you should use the newer syntax:

@autoreleasepool
{
    //Code to use a local autorelease pool
}
For loops that generate a ton of temporary objects, you can create a local autorelease pool for any block of code. You could refactor your code like this:
while (1)
{
    @autoreleasepool
    {
        NSArray *files = [nfm contentsOfDirectoryAtPath:dataDir error:nil];
        /*
        if (files == nil)
            break;

        if ([files count] <= 0)
        {
            files = nil;
            [NSThread sleepForTimeInterval:5.0f];
            continue;
        }

        if (![ViewController obtainLock])
        {
            files = nil;
            continue;
        }
        */
        //[ViewController releaseLock];

        files = nil;
    }
}
That would cause it to create a new autorelease pool for each iteration of the loop, and drain it at the end (thus releasing all temporary objects created in that iteration.)
By the way, if you're using NSThread, don't. Learn how to use GCD instead. GCD (Grand Central Dispatch) makes much more efficient use of system resources than NSThread. NSThread is based on POSIX threads, which are expensive to create and tie up physical memory on the device for the life of your application.
NSOperationQueue has been refactored in recent OS versions to use GCD under the covers, so it is similarly efficient internally, although harder to use than GCD for most things.
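For this particular polling loop, a GCD timer source would avoid dedicating a thread at all. A minimal sketch, assuming dataDir as in the original code (the source must be stored in a strong reference, e.g. an ivar, or it will be deallocated):

dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_source_t timer = dispatch_source_create(DISPATCH_SOURCE_TYPE_TIMER, 0, 0, queue);
dispatch_source_set_timer(timer, DISPATCH_TIME_NOW, 5 * NSEC_PER_SEC, 1 * NSEC_PER_SEC);
dispatch_source_set_event_handler(timer, ^{
    // GCD drains its own pools, so no explicit @autoreleasepool is needed here
    NSArray *files = [[NSFileManager defaultManager] contentsOfDirectoryAtPath:dataDir
                                                                         error:nil];
    // ...process 'files'...
});
dispatch_resume(timer);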
Well, I think the answer is obvious. Take a look at the documentation:
Many programs create temporary objects that are autoreleased. These objects add to the program's memory footprint until the end of the block. In many situations, allowing temporary objects to accumulate until the end of the current event-loop iteration does not result in excessive overhead; in some situations, however, you may create a large number of temporary objects that add substantially to memory footprint and that you want to dispose of more quickly. In these latter cases, you can create your own autorelease pool block. At the end of the block, the temporary objects are released, which typically results in their deallocation, thereby reducing the program's memory footprint.
And you can see this question.
So the solution is to use @autoreleasepool {}, because you are creating many temporary objects.
If I have an autoreleased object and I need to provide it to a different thread, what is the best way to do so?
Let's say I have an object that is autoreleased on thread 0. I tell thread 1 about this object, and it retains it because it needs it; later, when it's done, it releases it. No problem. When thread 0 runs again and empties its autorelease pool, it sees the retain count is 1 and, because it's an autoreleased object, it deallocs. Everything is fine, therefore threads don't matter. Right?
By the way, this was originally an interview question. The interviewer insisted that an autoreleased object cannot be given to another thread. He seemed almost angry about it. More and more in tech interviews, I encounter people who believe they know everything.
You should not pass an autoreleased object directly to another thread.
In this code:
id _sharedVariable; // ivar
NSConditionLock *_lock;

- (void)thread1
{
    id objectNeedToPass = [[NSObject new] autorelease];
    [_lock lock];
    _sharedVariable = objectNeedToPass;
    [_lock unlockWithCondition:1];
}

- (void)thread2
{
    while (true)
    {
        [_lock lockWithCondition:1];
        id objectReceived = [_sharedVariable retain];
        [_lock unlockWithCondition:0];

        process(objectReceived);
        [objectReceived release];
    }
}
thread2 may see _sharedVariable holding an already-released object (and crash), because the following interleaving is possible:
thread 1 creates and autoreleases the object
thread 1 assigns it to the shared variable
thread 1's pool releases the object
the object is deallocated
thread 2 reads the shared variable
thread 2 retains the deallocated object and crashes
To solve the problem, you should pass a retained object:
id _sharedVariable; // ivar
NSConditionLock *_lock;

- (void)thread1
{
    id objectNeedToPass = [[NSObject new] autorelease];
    [_lock lock];
    _sharedVariable = [objectNeedToPass retain];
    [_lock unlockWithCondition:1];
}

- (void)thread2
{
    while (true)
    {
        [_lock lockWithCondition:1];
        id objectReceived = _sharedVariable;
        [_lock unlockWithCondition:0];

        process(objectReceived);
        [objectReceived release];
    }
}
However, this may cause a memory leak if the second thread fails to release the object, and it makes the code hard to maintain (retain/release pairs are hard to balance).
There is nothing to worry about at all as long as you are following the normal Cocoa memory management rules. Every single way of "providing it to a different thread" will work fine as long as you are following the rules.
Pretty much any time you "provide something to a different thread", it is asynchronous (unless you are using locks to do synchronous cross-thread execution or something). Which means that the other thread may (and will likely) use it after the current function on this thread has gone out of scope. Any time you store an object that needs to outlive the current execution, it needs to be retained. If you are storing it in an instance variable or global variable directly, then you are responsible for retaining it, according to the memory management rules. If you are storing it in some kind of container object, then that object is responsible for retaining it. So pretty much if you follow the rules, there is nothing to worry about.
Let's consider a common way that people execute things on another thread, with -performSelector:onThread:withObject:waitUntilDone:. If waitUntilDone is false, this function stores the receiver, selector, and argument in some kind of object to wait until the other thread is ready to execute it. Therefore, this function must be responsible for retaining the receiver and object when it places it into this structure, and releasing it when the structure is destroyed. And indeed it does -- if you read the pre-ARC documentation for the method, it says "This method retains the receiver and the arg parameter until after the selector is performed."
So basically the memory management rules are sufficient -- if you store the object in an instance variable, you need to retain it. If you pass it to some other function, then it's their job to take care of it.
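To make the performSelector case concrete, here is a minimal sketch under manual reference counting; worker, workerThread, and handleMessage: are hypothetical names:

NSString *message = [NSString stringWithFormat:@"update %d", 42]; // autoreleased
// Safe: the framework retains 'message' until the selector has been performed.
[worker performSelector:@selector(handleMessage:)
               onThread:workerThread
             withObject:message
          waitUntilDone:NO];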
Don't. Pass an owning reference to the other thread. The other thread will take ownership of the object and release it when done with it.
With autoreleased objects, you can't tell when the sending thread's autorelease pool will be drained, and can't be sure it won't be drained before the receiving thread gets the object.
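A minimal sketch of that hand-off under manual reference counting, using NSThread (whose documentation states that the target and argument are retained for the duration of the detached thread); worker and consume: are hypothetical names:

id payload = [[NSObject alloc] init];   // +1 reference, deliberately not autoreleased
[NSThread detachNewThreadSelector:@selector(consume:)
                         toTarget:worker
                       withObject:payload];
[payload release]; // the thread machinery holds its own reference until it's done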
With reference to this answer, I am wondering: is this correct?

@synchronized does not make any code "thread-safe"

I tried to find documentation or a link to support this statement, without success.
Any comments and/or answers will be appreciated.
I know we can reach for other tools for better thread safety.
@synchronized does make code thread-safe if it is used properly.
For example:
Let's say I have a class that accesses a non-thread-safe database. I don't want to read and write to the database at the same time, as this would likely result in a crash.
So let's say I have two methods, storeData: and readData, on a singleton class called LocalStore.
- (void)storeData:(NSData *)data
{
    [self writeDataToDisk:data];
}

- (NSData *)readData
{
    return [self readDataFromDisk];
}
Now, if I were to dispatch each of these methods onto its own thread, like so:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    [[LocalStore sharedStore] storeData:data];
});

dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    [[LocalStore sharedStore] readData];
});
Chances are we would get a crash. However, if we change our storeData and readData methods to use @synchronized:
- (void)storeData:(NSData *)data
{
    @synchronized(self) {
        [self writeDataToDisk:data];
    }
}

- (NSData *)readData
{
    @synchronized(self) {
        return [self readDataFromDisk];
    }
}
Now this code would be thread-safe. It is important to note, however, that if I removed one of the @synchronized statements, or synchronized on different objects instead of self, the code would no longer be thread-safe.
@synchronized creates a mutex lock on the object you are synchronizing on. In other words, if any code wants to access code in a @synchronized(self) { } block, it will have to get in line behind all previous code running within blocks synchronized on that same object.
If we were to create different LocalStore objects, @synchronized(self) would only lock down each object individually. Does that make sense?
Think of it like this: you have a whole bunch of people waiting in separate lines, each line numbered 1-10. You can choose which line you want each person to wait in (by synchronizing on a per-line basis); or, if you don't use @synchronized, you jump straight to the front and skip all the lines. A person in line 1 doesn't have to wait for a person in line 2 to finish, but a person in line 1 does have to wait for everyone ahead of them in their own line to finish.
I think the essence of the question is:

is the proper use of @synchronized able to solve any thread-safety problem?

Technically yes, but in practice it's advisable to learn and use other tools.
I'll answer without assuming previous knowledge.
Correct code is code that conforms to its specification. A good specification defines
invariants constraining the state,
preconditions and postconditions describing the effects of the operations.
Thread-safe code is code that remains correct when executed by multiple threads. Thus,
No sequence of operations can violate the specification.1
Invariants and conditions will hold during multithread execution without requiring additional synchronization by the client2.
The high-level takeaway point is: thread safety requires that the specification hold true during multithreaded execution. To actually code this, we have to do just one thing: regulate the access to mutable shared state3. And there are three ways to do it:
Prevent the access.
Make the state immutable.
Synchronize the access.
The first two are simple. The third one requires preventing the following thread-safety problems:
liveness problems:
deadlock: two threads block permanently, each waiting for the other to release a needed resource.
livelock: a thread is busy working but unable to make any progress.
starvation: a thread is perpetually denied access to resources it needs in order to make progress.
safe publication: both the reference and the state of the published object must be made visible to other threads at the same time.
race conditions: a race condition is a defect where the output depends on the timing of uncontrollable events; in other words, getting the right answer relies on lucky timing. Any compound operation can suffer a race condition, for example "check-then-act" or "put-if-absent". A concrete problem would be if (counter) counter--;, and one of several solutions would be @synchronized(self) { if (counter) counter--; }.
To solve these problems we use tools like @synchronized, volatile, memory barriers, atomic operations, specific locks, queues, and synchronizers (semaphores, barriers).
And going back to the question:
is the proper use of @synchronized able to solve any thread-safety problem?

Technically yes, because any tool mentioned above can be emulated with @synchronized. But it would result in poor performance and increase the chance of liveness-related problems. Instead, you need to use the appropriate tool for each situation. Example:
counter++;                         // wrong, compound operation (fetch, ++, set)
@synchronized(self) { counter++; } // correct but slow, thread contention
OSAtomicIncrement32(&counter);     // correct and fast, lockless atomic hw op
In the case of the linked question you could indeed use @synchronized, or a GCD read-write lock (see the sketch after the footnotes below), or create a collection with lock striping, or whatever the situation calls for. The right answer depends on the usage pattern. Whichever way you do it, you should document what thread-safety guarantees your class offers.
1 That is, seeing the object in an invalid state or violating the pre/post conditions.
2 For example, if thread A iterates a collection X and thread B removes an element, execution crashes. This is not thread-safe because the client would have to synchronize on the intrinsic lock of X (@synchronized(X)) to get exclusive access. However, if the iterator returns a copy of the collection, the collection becomes thread-safe.
3 Immutable shared state, or mutable non-shared objects, are always thread-safe.
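For reference, a minimal sketch of the GCD read-write pattern mentioned above, assuming _queue is a concurrent dispatch queue and _dict is a mutable-dictionary ivar:

- (id)objectForKey:(id)key {
    __block id result;
    dispatch_sync(_queue, ^{ result = [_dict objectForKey:key]; }); // reads run concurrently
    return result;
}

- (void)setObject:(id)object forKey:(id<NSCopying>)key {
    dispatch_barrier_async(_queue, ^{ [_dict setObject:object forKey:key]; }); // writes are exclusive
}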
Generally, @synchronized guarantees thread safety, but only when used correctly. It is also safe to acquire the lock recursively, albeit with limitations I detail in my answer here.
There are several common ways to use @synchronized wrong. These are the most common:
Using @synchronized to ensure atomic object creation:
- (NSObject *)foo {
    @synchronized(_foo) {
        if (!_foo) {
            _foo = [[NSObject alloc] init];
        }
        return _foo;
    }
}
Because _foo will be nil when the lock is first acquired, no locking will occur (@synchronized(nil) does nothing) and multiple threads can potentially create their own _foo before the first completes.
Using @synchronized to lock on a new object each time:
- (void)foo {
    @synchronized([[NSObject alloc] init]) {
        [self bar];
    }
}
I've seen this code quite a bit, as well as the C# equivalent lock(new object()) {..}. Since it attempts to lock on a new object each time, every thread will always be allowed into the critical section of code. This is not some kind of code magic; it does absolutely nothing to ensure thread safety.
Lastly, locking on self.
- (void)foo {
    @synchronized(self) {
        [self bar];
    }
}
While not by itself a problem, if your code uses any external code or is itself a library, it can be an issue. While internally the object is known as self, it externally has a variable name. If the external code calls @synchronized(_yourObject) {...} and you call @synchronized(self) {...}, you may find yourself in deadlock. It is best to create an internal object to lock upon that is not exposed outside of your object. Adding _lockObject = [[NSObject alloc] init]; inside your init function is cheap, easy, and safe.
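A minimal sketch of that advice; MyClass and bar are placeholder names:

@implementation MyClass {
    NSObject *_lockObject;  // private; never visible to callers
}

- (instancetype)init {
    if ((self = [super init])) {
        _lockObject = [[NSObject alloc] init];
    }
    return self;
}

- (void)foo {
    @synchronized(_lockObject) {  // external code cannot synchronize on this object
        [self bar];
    }
}

@end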
EDIT:
I still get asked questions about this post, so here is an example of why it is a bad idea to use @synchronized(self) in practice.
@interface Foo : NSObject
- (void)doSomething;
@end

@implementation Foo

- (void)doSomething {
    sleep(1);
    @synchronized(self) {
        NSLog(@"Critical Section.");
    }
}

@end
// Elsewhere in your code
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
Foo *foo = [[Foo alloc] init];
NSObject *lock = [[NSObject alloc] init];

dispatch_async(queue, ^{
    for (int i = 0; i < 100; i++) {
        @synchronized(lock) {
            [foo doSomething];
        }
        NSLog(@"Background pass %d complete.", i);
    }
});

for (int i = 0; i < 100; i++) {
    @synchronized(foo) {
        @synchronized(lock) {
            [foo doSomething];
        }
    }
    NSLog(@"Foreground pass %d complete.", i);
}
It should be obvious why this deadlocks: the locks on foo and lock are acquired in different orders on the foreground vs. background threads. It's easy to say that this is bad practice, but if Foo is a library, the user is unlikely to know that its code contains a lock.
@synchronized alone doesn't make code thread-safe, but it is one of the tools used in writing thread-safe code.
With multi-threaded programs, you often have a complex structure that you want maintained in a consistent state, with only one thread having access at a time. The common pattern is to use a mutex to protect a critical section of code where the structure is accessed and/or modified.
@synchronized is a thread-safety mechanism. The code written inside the block becomes part of a critical section that only one thread may execute at a time.
@synchronized applies the lock implicitly, whereas NSLock applies it explicitly.
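Roughly, the explicit equivalent looks like this (a sketch; it ignores that @synchronized is recursive and exception-safe):

NSLock *lock = [[NSLock alloc] init];

[lock lock];
// ...critical section, equivalent to the body of a @synchronized block...
[lock unlock];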
It only helps with thread safety; it does not guarantee it. It's like hiring an expert driver for your car: it doesn't guarantee the car won't meet with an accident, but the probability remains slight.
Its companion in GCD (Grand Central Dispatch) is dispatch_once, which serves a similar purpose where the critical section is one-time initialization.
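For example, dispatch_once is the standard way to guard the one-time creation of a singleton such as the LocalStore used earlier in this thread (a sketch):

+ (instancetype)sharedStore {
    static LocalStore *shared = nil;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        shared = [[LocalStore alloc] init]; // runs exactly once, thread-safely
    });
    return shared;
}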
The @synchronized directive is a convenient way to create mutex locks on the fly in Objective-C code.
Side effects of mutex locks:
deadlocks
starvation
Thread safety will depend on how the @synchronized block is used.
I've started experimenting with POSIX threads on the iOS platform. Coming from NSThread, it's pretty daunting.
Basically, in my sample app I have a big array filled with values of type mystruct. Every so often (very frequently) I want to perform a task with the contents of one of these structs in the background, so I pass it to detachnewthread to kick things off.
I think I have the basics down, but I'd like to get a professional opinion before I attempt to move on to more complicated stuff.
Does what I have here seem OK, and could you point out anything missing that could cause problems? Can you spot any memory management issues, etc.?
struct mystruct
{
    pthread_t thread;
    int a;
    long c;
};

void *DoStuffWithMyStruct(void *threadid);  // forward declaration

void detachnewthread(struct mystruct *str)
{
    if (str)
    {
        int rc;
        // printf("In detachnewthread: creating thread %d\n", str->soundid);
        rc = pthread_create(&str->thread, NULL, DoStuffWithMyStruct, (void *)str);
        if (rc) {
            printf("ERROR; return code from pthread_create() is %d\n", rc);
            //exit(-1);
        }
    }

    /* Last thing that main() should do */
    // pthread_exit(NULL);
}

void *DoStuffWithMyStruct(void *threadid)
{
    struct mystruct *sptr = (struct mystruct *)threadid;

    // do stuff with data in my struct

    pthread_detach(sptr->thread);
    return NULL;
}
One potential issue would be how the storage for the passed-in mystruct structure is created. The lifetime of that variable is critical to its usage in the thread. For example, if the caller of detachnewthread had it declared on the stack and then returned before the thread finished, the result would be undefined behavior. Likewise, if it were dynamically allocated, then it is necessary to make sure it is not freed before the thread is finished.
In response to the comment/question: The necessity of some kind of mutex depends on the usage. For the sake of discussion, I will assume it is dynamically allocated. If the calling thread fills in the contents of the structure prior to creating the "child" thread and can guarantee that it will not be freed until after the child thread exits, and the subsequent access is read/only, then you would not need a mutex to protect it. I can imagine that type of scenario if the structure contains information that the child thread needs for completing its task.
If, however, more than one thread will be accessing the contents of the structure and one or more threads will be changing the data (writing to the structure), then you probably do need a mutex to protect it.
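A minimal sketch of that case: adding a mutex field to the struct and locking around each access. Here set_a is a hypothetical accessor, and pthread_mutex_init must be called before the struct is shared:

struct mystruct
{
    pthread_t thread;
    pthread_mutex_t mutex;  /* protects 'a' and 'c' */
    int a;
    long c;
};

void set_a(struct mystruct *s, int value)
{
    pthread_mutex_lock(&s->mutex);
    s->a = value;
    pthread_mutex_unlock(&s->mutex);
}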
Try using Apple's Grand Central Dispatch (GCD), which will manage the threads for you. GCD provides the capability to dispatch work, via blocks, to various queues that are managed by the system. Some of the queue types are concurrent, serial, and of course the main queue, where the UI runs. Based upon the CPU resources at hand, the system manages the queues and the threads necessary to get the work done. A simple example, which shows how you can nest calls to different queues, looks like this:
__block MyClass *blockSelf = self;

dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    [blockSelf doSomeWork];

    dispatch_async(dispatch_get_main_queue(), ^{
        [blockSelf.textField setStringValue:@"Some work is done, updating UI"];
    });
});
__block MyClass *blockSelf = self is used simply to avoid the retain cycles associated with how blocks work.
Apple's docs:
http://developer.apple.com/library/ios/#documentation/Performance/Reference/GCD_libdispatch_Ref/Reference/reference.html
Mike Ash's Q&A blog post:
http://mikeash.com/pyblog/friday-qa-2009-08-28-intro-to-grand-central-dispatch-part-i-basics-and-dispatch-queues.html