Proper use of CFStringTokenizerCreate with ARC?


I have a piece of code that the ARC converter turned into this...
// firstRange is an NSRange, obviously
// text is an NSString * passed in as a parameter to the method
NSRange range = NSMakeRange(firstRange.location, (lastRange.location - firstRange.location) + lastRange.length);
NSString *sentence = [text substringWithRange:range];
// OK, now chop it up with the better parser
CFRange allTextRange = CFRangeMake(0, [sentence length]);
CFLocaleRef locale = CFLocaleCopyCurrent();
CFStringTokenizerRef tokenizer = CFStringTokenizerCreate(kCFAllocatorDefault,
(__bridge CFStringRef) sentence,
allTextRange,
kCFStringTokenizerUnitWord,
locale);
I call this A LOT and I suspect that it leaks somehow. Is that CFStringTokenizerCreate call kosher? I am especially suspicious of the __bridge call. Do I create an intermediate that I have to manually release or some such evil?

You need to CFRelease the tokenizer and locale or else they will leak.
This falls under Core Foundation Ownership Policy and has nothing to do with ARC.
The __bridge cast tells ARC that no ownership of sentence is transferred in the CFStringTokenizerCreate call, so that part is OK.
You can test for memory leaks with Xcode's static analyser and profiler.
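A minimal sketch of the ownership rules described above, assuming a hypothetical helper named WordsInSentence that is not part of the original code. The tokenizer and locale, both obtained from a Create or Copy function, are released when the loop finishes, while sentence stays under ARC's control because of the plain __bridge cast:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the question): splits a sentence into words.
static NSArray *WordsInSentence(NSString *sentence) {
    NSMutableArray *words = [NSMutableArray array];

    // "Copy" in the name: we own the locale and must release it.
    CFLocaleRef locale = CFLocaleCopyCurrent();

    // "Create" in the name: we own the tokenizer and must release it.
    // __bridge only reinterprets the pointer; ARC keeps managing sentence.
    CFStringTokenizerRef tokenizer =
        CFStringTokenizerCreate(kCFAllocatorDefault,
                                (__bridge CFStringRef)sentence,
                                CFRangeMake(0, (CFIndex)[sentence length]),
                                kCFStringTokenizerUnitWord,
                                locale);

    if (tokenizer != NULL) {
        while (CFStringTokenizerAdvanceToNextToken(tokenizer) != kCFStringTokenizerTokenNone) {
            CFRange r = CFStringTokenizerGetCurrentTokenRange(tokenizer);
            NSRange range = NSMakeRange((NSUInteger)r.location, (NSUInteger)r.length);
            [words addObject:[sentence substringWithRange:range]];
        }
        CFRelease(tokenizer); // balances CFStringTokenizerCreate
    }
    if (locale != NULL) {
        CFRelease(locale);    // balances CFLocaleCopyCurrent
    }
    return words;
}
```

CFRelease crashes when passed NULL, which is why both calls are guarded here.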

You need to call CFRelease(tokenizer); when you are done using the tokenizer. See Ownership Policy. You should call CFRelease(locale); too.
Your __bridge sentence syntax is correct. I must say that Xcode is correct about __bridge and __bridge_transfer most of the time. In your case, you are passing a reference to an NSObject for use with CF. You have no intention of transferring ownership to CF, because ARC is good at managing NSObjects. So when CFStringTokenizerCreate is done using sentence, it won't do anything to free it. ARC will then free sentence.
On the other hand, if you changed it to __bridge_retained, you would be telling ARC that you are transferring ownership to CF. ARC would then no longer free sentence when you are done, and you would have to call CFRelease on the resulting CFStringRef yourself, which is not the desired behavior. (__bridge_transfer goes the other way: it moves ownership from CF to ARC, so it does not apply when casting an NSString * to a CFStringRef.)

My gut tells me that it should be __bridge_transfer instead of bridge since you are calling create (unless there is a CFRelease call later). I also think the locale needs to be released since it is a copy.
EDIT Oops ignore me, I read it wrong (was using a phone)

For any Swift users reading this thread: the CFRelease() function does not seem to have been carried on into the Swift language, as Core Foundation objects are automatically memory managed (according to a compiler warning I'm seeing in Swift 3.0.2), so that's one less thing to think about.

Related

Want Autorelease lifetime with ARC

Transitioning to ARC on iOS.
I have an autoreleased NSString that I use to generate a UTF-8 representation, and rely on pool lifetime to keep the UTF-8 pointer alive:
char *GetStringBuffer(something)
{
NSString *ns = [NSString stringWithSomething:something];
return [ns UTF8String];
}
The nature of something is not important here.
Pre-ARC rules make sure the returned data pointer will stay valid for the lifetime of current autorelease pool. Crucially, I don't carry the NSString pointer around.
Now, under ARC, won't the string be released when the function returns? I don't think ARC will consider a char * to a structure deep inside an NSString a strong reference, especially seeing that it's not explicitly freed ever.
What's the best ARC idiom here?

If you want to guarantee that the return value of UTF8String is valid until the current autorelease pool is drained, you have two options:
Define GetStringBuffer in a file that is compiled with ARC disabled. If stringWithSomething: follows convention, it must return an autoreleased NSString to a non-ARC caller. If it doesn't (e.g. it acts like -[NSArray objectAtIndex:]), you can explicitly retain and autorelease it.
Use toll-free bridging and CFAutorelease:
char *GetStringBuffer(something) {
NSString *ns = [NSString stringWithSomething:something];
CFAutorelease(CFBridgingRetain(ns));
return [ns UTF8String];
}

(I can't delete this because it's accepted, but this answer is incorrect. See Rob Mayoff's answer, and the comments on that answer for an explanation. There is no promise that this pointer is valid past the return statement.)
Rewriting my whole answer. Had to dig and dig, but I believe this is surprisingly safe today (due to improvements in ARC since I had my crashes; I knew something like that was rolling around in the back of my head). Here's why:
@property (readonly) __strong const char *UTF8String NS_RETURNS_INNER_POINTER; // Convenience to return null-terminated UTF8 representation
UTF8String is marked NS_RETURNS_INNER_POINTER, which is really objc_returns_inner_pointer. Calling that method:
the object’s lifetime will be extended until at least the earliest of:
the last use of the returned pointer, or any pointer derived from it, in the calling function or
the autorelease pool is restored to a previous state.
Which is pretty much what you wanted it to do.

What is Apple warning against in ARC documentation caveat about pass by reference?

In Apple's documentation about ARC, they make a point of spelling out a problematic scenario in which ARC will generate a boilerplate temporary variable behind the scenes. Search on "The compiler therefore rewrites":
https://developer.apple.com/library/mac/releasenotes/ObjectiveC/RN-TransitioningToARC/Introduction/Introduction.html
The gist of the warning seems to be that because the stack-based variable is "strong" and the by-reference parameter to the called method (performOperationWithError:) is autoreleasing, ARC will generate a temporary local variable to serve the memory management needs of the autoreleasing variable. But because the temporary variable is assigned to the strong variable in the boilerplate example, it seems as though from a client's point of view there is no risk.
What exactly is the documentation taking pains to warn us about here? What is the risk as either a client or as an implementor of a method that may be called in this way (with an autoreleased, return-by-value parameter)?

It's only a warning about less than ideal performance. In the rewritten code, the NSError pointed to by "tmp" comes back autoreleased, is retained when assigned to "error", and then is released again when "error" goes out of scope.
Suppose you change the declaration in the original code to:
NSError __autoreleasing *error;
With this declaration there is no assignment to a temp, and the implicit retain and release no longer occur. (The NSError object itself is still valid for exactly as long as it was before, since it is still in the autorelease pool.) So the documentation is warning you that using the "wrong" variable qualifier can cause extra retain count munging that wouldn't otherwise be required.
Also note that with either version of the code: Because the variable in question is passed by reference and isn't the return value from -performOperationWithError:, there isn't the opportunity to do the magic stack walking trick that ARC can do to save the object from going into the autorelease pool in the first place.
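The two declarations can be compared side by side in a small sketch. The Worker class and its failing implementation are hypothetical, added only to make the example self-contained; the comments restate the compiler rewrite described in Apple's release notes:

```objc
#import <Foundation/Foundation.h>
#include <errno.h>

// Hypothetical class, defined only so the sketch compiles on its own.
@interface Worker : NSObject
- (BOOL)performOperationWithError:(NSError * __autoreleasing *)error;
@end

@implementation Worker
- (BOOL)performOperationWithError:(NSError * __autoreleasing *)error {
    if (error) {
        *error = [NSError errorWithDomain:NSPOSIXErrorDomain code:ENOSYS userInfo:nil];
    }
    return NO; // always fails, for illustration
}
@end

int main(void) {
    @autoreleasepool {
        Worker *worker = [[Worker alloc] init];

        // Strong local: ARC passes the address of a hidden __autoreleasing
        // temporary and copies it back into strongError after the call,
        // which costs an extra retain and release.
        NSError *strongError = nil;
        [worker performOperationWithError:&strongError];

        // Autoreleasing local: its address is passed directly, no temporary.
        NSError * __autoreleasing autoreleasingError = nil;
        [worker performOperationWithError:&autoreleasingError];

        NSLog(@"strong: %@", strongError);
        NSLog(@"autoreleasing: %@", autoreleasingError);
    }
    return 0;
}
```

Both calls report the same error; the difference lies only in the retain/release traffic the compiler emits around the call.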

I think it’s to prevent confusion if you start looking at the values passed into the method. In their example, if I set a breakpoint on the line that calls [myObject performOperationWithError:&tmp]; and type p error, I’ll see the address of it. But if I step into -performOperationWithError: and type p error, I’ll get a different value—inside the method, error points to that temporary value.
I can see a situation where some poor sap is trying to debug something tricky with ARC where the pointer changing as it gets passed into the method would be an extremely confusing red herring.

My guess: If you made assumptions about the memory referenced by the output parameter, e.g indexing off the pointer, you might be surprised.

I don't think it has anything to do with the client. It looks like a reference to the same issue addressed in the WWDC 2013 video on memory issues: If you yourself implement a method that takes an autoreleasing indirection parameter (such as an NSError**), and if you create an autorelease pool block inside that method, do not assign to the NSError from inside the autorelease pool block. Instead, assign to a local variable, and then assign from the local to the NSError outside the autorelease pool block.
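A sketch of that pattern, assuming a hypothetical Loader class and a stand-in failure (neither is from the WWDC session). The error is captured in a strong local inside the pool block and only handed to the caller's out-parameter after the pool has drained:

```objc
#import <Foundation/Foundation.h>
#include <errno.h>

// Hypothetical class used only for illustration.
@interface Loader : NSObject
- (BOOL)loadWithError:(NSError * __autoreleasing *)outError;
@end

@implementation Loader
- (BOOL)loadWithError:(NSError * __autoreleasing *)outError {
    NSError *localError = nil; // strong, so it survives the pool drain
    BOOL ok = NO;

    @autoreleasepool {
        // Stand-in for real work that fails and produces an error.
        // Assigning to *outError here would place the error in the inner
        // pool, and it could be destroyed when this block exits.
        localError = [NSError errorWithDomain:NSPOSIXErrorDomain code:EIO userInfo:nil];
    }

    if (!ok && outError) {
        *outError = localError; // assigned outside the pool block
    }
    return ok;
}
@end

int main(void) {
    @autoreleasepool {
        NSError *error = nil;
        if (![[[Loader alloc] init] loadWithError:&error]) {
            NSLog(@"load failed: %@", error);
        }
    }
    return 0;
}
```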

Seems to me that it's less of a warning about this behavior than a description of what the compiler does in this case and why you can pass the address of a strong local error reference to a method that is declared as wanting an __autoreleasing reference and not trigger a complaint.
You generally want an API to use __autoreleasing on such a parameter in case it is being used by either ARC or non-ARC code, as in non-ARC code it would be unusual to have to release such an output parameter.

The Apple documentation is referring to a compiler misfeature that will synthesize a temporary variable for you to deal with the conversion between __block and __autoreleasing. Sadly, this doesn't solve very many problems and it produces potentially disastrous unexpected results.
For example:
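#import <Foundation/Foundation.h>
int main(int argc, char *argv[])
{
    __block id value = @"initial value";
    // The block ignores its parameter and writes to the captured variable.
    void (^block)(id *outValue) = ^(id *outValue){
        value = @"hello";
    };
    block(&value);
    NSLog(@"value = %@", value);
    return 0;
}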
With ARC, this reports:
2013-04-24 13:55:35.814 block-local-address[28013:707] value = initial value
but with MRR:
2013-04-24 13:57:26.058 block-local-address[28046:707] value = hello
This very often comes up when using NSFileCoordinator, causing you to lose the resulting NSError!
#import <Foundation/Foundation.h>
int main(int argc, char *argv[])
{
    NSURL *fileURL = [NSURL fileURLWithPath:@"/tmp/foo"];
    NSFileCoordinator *coordinator = [[NSFileCoordinator alloc] initWithFilePresenter:nil];
    __block NSError *error;
    [coordinator coordinateWritingItemAtURL:fileURL options:0 error:&error byAccessor:^(NSURL *newURL){
        NSDictionary *userInfo = @{
            NSLocalizedDescriptionKey : @"Testing bubbling an error out from a file coordination block."
        };
        error = [NSError errorWithDomain:NSPOSIXErrorDomain code:ENOSYS userInfo:userInfo];
    }];
    NSLog(@"error = %@", error);
}
When compiled with ARC, this results in a nil error!
This has been written up as a bug at llvm.org for a while, though I just changed the title to make it clearer that I'm suggesting the feature be ripped out. Also attached to that bug is a patch that adds a new flag, -fno-objc-arc-writeback, to disable the feature.
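One way to sidestep the write-back, sketched here as an assumption rather than something the bug report prescribes, is to keep the two roles in separate variables: one that the coordinator writes through its error: parameter, and a __block variable that only the accessor block writes. The variable names are illustrative:

```objc
#import <Foundation/Foundation.h>
#include <errno.h>

int main(int argc, char *argv[])
{
    @autoreleasepool {
        NSURL *fileURL = [NSURL fileURLWithPath:@"/tmp/foo"];
        NSFileCoordinator *coordinator = [[NSFileCoordinator alloc] initWithFilePresenter:nil];

        NSError *coordinationError = nil;     // written only by the coordinator
        __block NSError *accessorError = nil; // written only inside the block

        [coordinator coordinateWritingItemAtURL:fileURL
                                        options:0
                                          error:&coordinationError
                                     byAccessor:^(NSURL *newURL) {
            accessorError = [NSError errorWithDomain:NSPOSIXErrorDomain
                                                code:ENOSYS
                                            userInfo:nil];
        }];

        // The write-back after the call only touches coordinationError,
        // so the value stored by the block is not overwritten.
        NSError *error = accessorError ? accessorError : coordinationError;
        NSLog(@"error = %@", error);
    }
    return 0;
}
```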

iOS: Error and Crash by NSString with C malloc

Was testing some code and found an error with the following lines:
NSString *stringA = @"C99";
NSString *stringB = (__bridge id)malloc(sizeof (stringA));
It is not necessary to alloc a NSString this way, of course, and I am not required to do that. Again I was just testing on something else and I happened to stumble upon this.
The error reads:
Thread 1: EXC_BAD_ACCESS (code=1, address=0x20)
In the console:
(lldb)
To generalize, perhaps I should ask:
Could we alloc Objective-C objects through the use of malloc?
Has someone encountered this before (which I doubt, because I don't think anyone who uses Objective-C would alloc a NSString this way), but rather than shoving it aside and call it a day, I thought I would ask and see if someone knows what the exact cause of this is and why.

It is possible to use custom allocators for Objective-C objects. The problems with your code include:
NSString is a class cluster superclass (similar to an "abstract class") and cannot be instantiated on its own. You would need to use some concrete subclass of NSString. Note that the OS API does not provide any such class.
sizeof(stringA) is the size of the pointer variable, 4 or 8 bytes, which is too small to hold an NSString instance. You would need to use class_getInstanceSize() to compute the size.
+alloc performs work other than the allocation itself which is not present here. You would need to erase the memory and call objc_constructInstance().
ARC forbids the use of the low-level runtime functions that are needed to accomplish the above tasks.
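The points above can be sketched for a plain NSObject, under the assumption that the file is compiled with ARC disabled (-fno-objc-arc), since objc_constructInstance is unavailable under ARC. This is an illustration of the runtime calls named in the list, not a recommended way to create objects:

```objc
// Build with ARC disabled: clang -fno-objc-arc -framework Foundation example.m
#import <Foundation/Foundation.h>
#import <objc/runtime.h>
#include <stdlib.h>

int main(void) {
    Class cls = [NSObject class];

    // Size of an instance, not the size of a pointer to one.
    size_t size = class_getInstanceSize(cls);

    // calloc hands back zeroed memory, as +alloc would.
    void *bytes = calloc(1, size);
    if (bytes == NULL) {
        return 1;
    }

    // Sets up the isa pointer (and runs C++ constructors, if any).
    id obj = objc_constructInstance(cls, bytes);
    obj = [obj init];

    NSLog(@"pointer size = %zu, instance size = %zu, object = %@",
          sizeof(obj), size, obj);

    // Tear down manually instead of sending release.
    objc_destructInstance(obj);
    free(bytes);
    return 0;
}
```

NSObject is used because, as noted above, NSString itself is a class cluster with no public concrete subclass to instantiate this way.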

As far as I could find, the closest example of allocating an NSString in a C-like way is this:
NSString* s4 = (NSString*)
CFStringCreateWithFormat(kCFAllocatorDefault, 0,
(CFStringRef) __builtin___CFStringMakeConstantString("%@ %@ (%@)"), s1, s2, s3);
Of course, if you want to go to even lower levels of allocation, you should look at how CFStringRef handles its own allocation.
I hope this answer satisfies you.
I found it here, along with other interesting things:
http://www.opensource.apple.com/source/clang/clang-318.0.45/src/tools/clang/test/Analysis/NSString.m

I think the question you should be asking is what purpose that code serves.
Note that sizeof doesn't return the number of bytes in stringA, it simply returns the size of the pointer that is stringA. Who knows what lives in that little block of memory that has been allocated to stringB. Maybe it's a string, maybe not. Life is full of mystery.

IOS: Release for NSString is not working as expected

I found a strange behavior with NSString. I ran the code below and noticed this.
NSString *str = [[NSString alloc] initWithString:@"hello"];
[str release];
NSLog(@" Print the value : %@", str);
Here, the app should crash on the third line because we are accessing an object that has been released. But it prints the value of str instead of crashing. With NSArray I observed different behavior.
NSArray *array = [[NSArray alloc] initWithObjects:@"1", @"2", nil];
[array release];
NSLog(@"Print : %@", [array objectAtIndex:0]);
NSLog(@"Print : %@", [array objectAtIndex:0]);
The code has two NSLog statements for the NSArray. After the release, the first NSLog prints the value, but the second NSLog crashes the app. A crash is acceptable because the array was already released, but it should happen at the first NSLog, not the second.
Please help me understand this behavior. How does release work in these cases?
Thanks
Jithen

The first example doesn't crash because string literals are never released. The code is really:
NSString *str = @"hello";
[str release];
People get burned with string literals on memory management and mistakenly using == to compare them instead of isEqualToString:. The compiler does some optimizations that lead to misleading results.
Update:
The following code proves my point:
NSString *literal = @"foo";
NSString *second = [NSString stringWithString:literal];
NSString *third = [NSString stringWithString:@"foo"]; // <-- this gives a compiler warning for being redundant
NSLog(@"literal = %p", literal);
NSLog(@"second = %p", second);
NSLog(@"third = %p", third);
This code gives the following output:
2013-02-28 22:03:35.663 SelCast[85617:11303] literal = 0x359c
2013-02-28 22:03:35.666 SelCast[85617:11303] second = 0x359c
2013-02-28 22:03:35.668 SelCast[85617:11303] third = 0x359c
Notice that all three variables point to the same memory.

Your second example crashes at the second NSLog because at the first log, the memory where array was hasn't been re-used, but that first log causes enough activity on the heap to cause the memory to become used by something else. Then, when you try to access it again, you get a crash.
Whenever an object is deallocated and its memory marked as free, there is going to be some period of time where that memory still stores what's left of that object. During this time you can still call methods on such objects and so forth, without crashing. This time is extremely short, and if you're running a lot of threads it may not even be enough to get your method call in. So clearly, don't rely on this implementation detail for any behavior.
As others have said, regarding your first question, NSString literals aren't going to be deallocated. This is true for some other Foundation classes (NSNumber comes to mind) but is an implementation detail as well. If you need to do experiments on memory management, use an NSObject instance instead, as it will not show the unusual behaviors.

When you send a release message to an object, the object is not necessarily removed from memory right away. The release message simply decrements the reference count by one. If the reference count reaches zero, the object is marked as free, and the system then removes it from memory. Until that deallocation happens you can still access your object. Even after you release the object, your pointer still points to it unless you assign nil to the pointer.

The first example doesn't crash because string literals are never released, whereas the second depends entirely on the release and retain counts.
Read this article. It contains a short-and-sweet explanation for your query.
You should also read this Apple guideline.

You seem to assume that release should destroy the object immediately. I don't think that's the guarantee that the language makes. What release means is: I have finished using this object and I promise not to use it again. From that point onwards it's up to the system to decide when to actually deallocate the memory.
Any behaviour you see beyond that is not defined and may change from one version of the Objective C runtime to the next.
That's to say that the other answers that suggest the difference is string literals and re-use of memory are currently correct but assuming that the behaviour will always be like this would likely be a mistake.

why not EXC_BAD_ACCESS?

I've written the following code:
NSString *string = [[NSString alloc] initWithFormat:@"test"];
[string release];
NSLog(@"string length = %lu", (unsigned long)[string length]);
//Why I don't get EXC_BAD_ACCESS at this point?
I should, it should be released. The retainCount should be 0 after last release, so why is it not?
P.S.
I am using the latest Xcode.
Update:
NSString *string = [[NSString alloc] initWithFormat:@"test"];
NSLog(@"retainCount before = %lu", (unsigned long)[string retainCount]);// => 1
[string release];
NSLog(@"retainCount after = %lu", (unsigned long)[string retainCount]);// => 1 Why!?

In this case, the frameworks are likely returning the literal @"test" from NSString *string = [[NSString alloc] initWithFormat:@"test"];. That is, it determines the literal may be reused, and reuses it in this context. After all, the input matches the output.
However, you should not rely on these internal optimizations in your programs -- just stick with the reference counting rules and well-defined behavior.
Update
David's comment caused me to look into this. On the system I tested, NSString *string = [[NSString alloc] initWithFormat:@"test"]; returns a new object. Your program messages an object which should have been released, and is not eligible for the immortal string status.
Your program still falls into undefined territory, and happens to appear to give the correct results in some cases only as an artifact of implementation details -- or just purely coincidence. As David pointed out, adding 'stuff' between the release and the log can cause string to really be destroyed and potentially reused. If you really want to know why this all works, you could read the objc runtime sources or crawl through the runtime's assembly as it executes. Some of it may have an explanation (runtime implementation details), and some of it is purely coincidence.

Doing things to a released object is an undefined behavior. Meaning - sometimes you get away with it, sometimes it crashes, sometimes it crashes a minute later in a completely different spot, sometimes a variable ten files away gets mysteriously modified.
To catch those issues, use the NSZombie technique. Look it up. That, and some coding discipline.
This time, you got away because the freed up memory hasn't been overwritten by anything yet. The memory that string points at still contains the bytes of a string object with the right length. Some time later, something else will be there, or the memory address won't be valid anymore. And there's no telling when this happens.
Sending messages to nil objects is, however, legitimate. That's a defined behavior in Objective C, in fact - nothing happens, 0 or nil is returned.
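As a small illustration of the nil-messaging rule, here is the question's code with the pointer cleared after the release. It assumes manual reference counting (compile with -fno-objc-arc), as in the question:

```objc
// Build with ARC disabled (-fno-objc-arc), since it calls release directly.
#import <Foundation/Foundation.h>

int main(void) {
    NSString *string = [[NSString alloc] initWithFormat:@"test"];
    [string release];
    string = nil; // drop the dangling pointer right after the release

    // Messaging nil is defined behavior: length returns 0, nothing crashes.
    NSLog(@"string length = %lu", (unsigned long)[string length]);
    return 0;
}
```

Clearing the pointer does not detect the bug of using an object after release, but it turns undefined behavior into a well-defined no-op.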

Update:
Ok. I'm tired and didn't read your question carefully enough.
The reason you are not crashing is pure luck. At first I thought that you were using initWithString:, in which case all the answers (including my original one below) about string literals would be valid.
What I mean by "pure luck"
The reason this works is just that the object is released but your pointer still points to where it used to be and the memory is not overwritten before you read it again. So when you access the variable you read from the untouched memory which means that you get a valid object back. Doing the above is VERY dangerous and will eventually cause a crash in the future!
If you start creating more objects between the release and the log, then there is a chance that one of them will use the same memory your string had, and you would crash when trying to read the old memory.
It is even so fragile that calling log twice in a row will cause a crash.
Original answer:
String literals never get released!
Take a look at my answer for this question for a description of why this is.
This answer also has a good explanation.

One possible explanation: You're superfluously dynamically allocating a string instead of just using the constant. Probably Cocoa already knows that's just a waste of memory (if you're not creating a mutable string), so it maybe releases the allocated object and returns the constant string instead. And on a constant string, release and retain have no effect.
To prove this, it's worth comparing the returned pointer to the constant string itself:
int main()
{
    NSString *s = @"Hello World!";
    NSString *t = [[NSString alloc] initWithFormat:s];
    if (s == t)
        NSLog(@"Strings are the same");
    else
        NSLog(@"Not the same; another instance was allocated");
    return 0;
}
