Peculiar issue with wiping the underlying buffer of NSString - ios

I am using a technique outlined in the book Hacking and Securing iOS Applications (relevant section here) to wipe the underlying buffer of a NSString as shown below.
NSString *s = [NSString stringWithFormat:#"Hello"];
unsigned char *text = (unsigned char*)CFStringGetCStringPtr((CFStringRef)s, CFStringGetSystemEncoding());
if (text != NULL)
{
memset(text, 0, [s length]);
}
This works, unless the string is a certain value.
// The following crashes with EXC_ACCESS_ERROR on memset
NSString *s = [NSString stringWithFormat:#"No"];
NSString *s = [NSString stringWithFormat:#"Yes"];
// These work fine though
NSString *s = [NSString stringWithFormat:#"Hello"];
NSString *s = [NSString stringWithFormat:#"Do"];
NSString *s = [#"N" stringByAppendingString:#"o"];
It looks like certain strings are not created on the heap but is optimized by making it point to a read-only string table even if the string is created on the heap.

Indeed constant string are not created on the heap and are in read-only memory. This includes a few that look like runtime but are made compile-time constants, your examples are such statements.
With this statement fragment there is no reason not to make it a compile-time constant.
[NSString stringWithFormat:#"No"]
it is equivalent to:
#"No"
Suggestion, file a bug report requesting a secure string class, I have. Several have been filed and I have been told that if there are enough (whatever that amount might be) are files it will implement it.
It is possible to subclass NSString, not easy but you will have compete control of the actual buffer and it should not be subject to possible failure due to Apple changing the implementation detail. I have done that successfully.

NSString *s = [NSString stringWithFormat:#"No"]; will be optimised to NSString *s = #"No"; because there are no substitutions in the format. Assigning a literal will give you a pointer to the readonly text segment of the loaded binary.
NSString *s = [#"N" stringByAppendingString:#"o"]; will create a new string on the heap and return a reference to it. The heap is read-write even though the datatype NSString is readonly.
When you get the CString pointer to the underlying data it's pointing either into read-only data in the first case, or read-write data in the second. The memset will fail on the readonly memory, but succeed on the read-write.

Related

What exactly does #"" do behind the scenes?

When giving a NSString a value using = #"", what exactly is that shorthand for? I noticed that, all other things being equal in an application, global NSStrings made with #"" don't need retain to be used outside of the methods that gave them that value, whereas NSStrings given a value in any other way do.
I have found this question utterly unsearchable, so I thank you for your time.
A related answer: Difference between NSString literals
In short, when writing #"" in the global scope of your program, it does the same as writing "": The literal string you specified will be embedded in your binary at compile time, and can thus not be deallocated during runtime. This is why you don't need release or retain.
The # just tells the compiler to construct a NSString object from the C string literal. Mind you that this construction is very cheap and probably heavily optimized. You can follow the link and see this example:
NSString *str = #"My String";
NSLog(#"%# (%p)", str, str);
NSString *str2 = [[NSString alloc] initWithString:#"My String"];
NSLog(#"%# (%p)", str2, str2);
Producing this output:
2011-11-07 07:11:26.172 Craplet[5433:707] My String (0x100002268)
2011-11-07 07:11:26.174 Craplet[5433:707] My String (0x100002268)
Note the same memory addresses
EDIT:
I made some test myself, see this code:
static NSString *str0 = #"My String";
int main(int argc, const char * argv[]) {
NSLog(#"%# (%p)", str0, str0);
NSString *str = #"My String";
NSLog(#"%# (%p)", str, str);
NSString *str2 = [[NSString alloc] initWithString:#"My String"];
NSLog(#"%# (%p)", str2, str2);
return 0;
}
Will produce this output:
2015-12-13 21:20:00.771 Test[6064:1195176] My String (0x100001030)
2015-12-13 21:20:00.772 Test[6064:1195176] My String (0x100001030)
2015-12-13 21:20:00.772 Test[6064:1195176] My String (0x100001030)
Also, when using the debugger you can see that the actual object being created when using literals are in fact __NSCFConstantString objects.
I think the related concept to that is called Class Clusters
Objective-C is a super-set of C, which means you should be able to write C code in there and have it work. As a result, the compiler needs a way to distinguish between a C string (an old-school series of bytes) and an Objective-C NSString (Unicode support, etc). This is done using the # symbol, such as #"Hello".
#"Hello" can't be released because it's a literal string – it's been written into the program when it was built, versus being assigned at run-time.
String allocated using the literal syntax i.e # reside as c strings in the data segment of the executable. They are allocated only once at the time of program launch and are never released until the program quits.
As an exercise you can try this:
NSString *str1 = #"Hello";
NSString *str2 = #"Hello";
NSLog("Memory Address of str1 : %p", str1);
NSLog("Memory Address of str2 : %p", str2);
both the log statements will print the same address which means literals are constant strings with lifetimes same as that of the program.

Need assistance regarding NSString

In NSString NSString Class Reference what this means
Distributed objects:
Over distributed-object connections, mutable string objects are passed by-reference and immutable string objects are passed by-copy.
And NSString can't be changed, so what happening when I am changing str in this code
NSString *str = #"";
for (int i=0; i<1000; i++) {
str = [str stringByAppendingFormat:#"%d", i];
}
will I get memory leak? Or what?
What your code is doing:
NSString *str = #""; // pointer str points at memory address 123 for example
for (int i=0; i<1000; i++) {
// now you don't change the value to which the pointer str points
// instead you create a new string located at address, lets say, 900 and let the pointer str know to point at address 900 instead of 123
str = [str stringByAppendingFormat:#"%d", i]; // this method creates a new string and returns a pointer to the new string!
// you can't do this because str is immutable
// [str appendString:#"mmmm"];
}
Mutable means you can change the NSString. For example with appendString.
pass by copy means that you get a copy of NSString and you can do whatever you want; it does not change the original NSString
- (void)magic:(NSString *)string
{
string = #"LOL";
NSLog(#"%#", string);
}
// somewhere in your code
NSString *s = #"Hello";
NSLog(#"%#", s); // prints hello
[self magic:s]; // prints LOL
NSLog(#"%#", s); // prints hello not lol
But imagine you get a mutable NSString.
- (void)magic2:(NSMutableString *)string
{
[string appendString:#".COM"];
}
// somewhere in your code
NSString *s = #"Hello";
NSMutableString *m = [s mutableCopy];
NSLog(#"%#", m); // prints hello
[self magic2:m];
NSLog(#"%#", m); // prints hello.COM
Because you pass a reference you can actually change the "value" of your string object since you are working with the original version and not a duplicate.
NOTE
String literals live as long as your app lives. In your exmaple it means that your NSString *str = #""; never gets deallocated. So in the end after you have looped through your for loop there are two string objects living in your memory. Its #"" which you cannot access anymore since you have no pointer to it but it is still there! And your new string str=123456....1000; But this is not a memory leak.
more information
No, you will not get memory leak with your code, as you are not retaining those objects in the loop, they're created with convenience method, you don't own them, and they will be released on next cycle of autorelease pool. And, it's doesn't matter if you are using ARC or not, objects created with convenience methods and not retained are released wherever they are out of their context.
In will not leak memory, but will get more memory allocation, due to making new copy of immutable copy as many time loop triggers [str stringByAppendingFormat:#"%d", i];.
Memory leak will get performed when, you put your data unreferenced, or orphan, this will not make your last copy of string orphan every time when loops, but will clear all copies of NSString when operation get complete, or viewDidUnload.
You will not get a memory leak in the example code because Automatic Reference Counting will detect the assignment to str and (automatically) release the old str.
But it would be much better coding style (and almost certainly better performance) to do this:
NSMutableString* mstr = [NSMutableString new];
for(int i = 0; i < 1000; ++i){
[mstr appendFormat:#"%d",i];
}
NSString* str = mstr;
...
As to the first question, I think it means that a change made to a mutable string by a remote process will be reflected in the originating process's object.

Multiple NSString declaration and initialization

I am new to the ios development.I was checking the NSString and where I had found out that it was excessed using the multiple ways which is as under
1) NSString *theString = #"hello";
2) NSString *string1 = [NSString stringWithString:#"This is"];
3) NSString *string =[[NSString alloc]init];
string =#"Hello";
Now I am confused with above three and would like to know the differences between then?
Yes all three are ways to create string...I try to explain you one by one.
One think you must remember that in Objective-c string is represented by #"".
whatever comes double quotes is string eg #"Hello" is string. #"" is basically literal to create NSString.
NSString *theString = #"hello";
EDIT:
In this statement we are creating an NSString constant. Objective-C string constant is created at compile time and exists throughout your program’s execution.
2. NSString *string1 = [NSString stringWithString:#"This is"];
In this statement again, we are creating an autorelease object of NSString, but here a slide diff. than the first case. here we are using another NSString object to create a new autorelease object of NSString. So generally we use stringWithString method when we already have NSString Object and we want another NSString object with similar content.
3. NSString *string =[[NSString alloc]init];
string =#"Hello";
Here, In first statement you are creating a NSString Object that you owned and it is your responsibility to release it (In non-ARC code), once you done with this object.
Second statement is similar to case 1, by string literal you are creating string Object.
but when you put these two statements together it causes you memory-leak(In non-ARC code), because in statement one you are allocating & initiating memory for new string object and right after in second statement you are again assigning a new string object to the same string reference.
1) and 2) is the same. there is no difference. [NSString stringWithString: str] just does nothing and returns str if str is immutable already. via this link
3) you create empty NSString pointer. after you assign this string with ' = ' is same way with 1) and it alloc memory to your NSString. You can watch this NSString pointer in 3) and you see pointer was changed:
NSString *string =[[NSString alloc]init];
printf("\n string_1-> = %p\n", string );
string = #"Hello";
printf("\n string_2-> = %p\n", string );
OK, first off, the syntax #"blah" is an Objective-C literal to create an NSString. Essentially it is an NSString.
Also, there are two parts to each statement. The declaration of the string variable NSString *string and the instantiation string = <something>.
So...
This is how you would do it most of the time. You declare the variable and set it using the objective-c literal.
This adds a bit of redundancy and you wouldn't really use it like this. You'd maybe use it with a substring or something from another string. As it is you are essentially creating two objects here. The first is [NSString stringWithString:<>]; and the second is #"This is".
This is a bit odd. You are first creating an empty string #"" and then you are replacing this string with #"Hello". Don't do this. There is no point. You may as well just do number 1 as the result is the same.
All three will work but in these exact cases you'd be better off using the first.

Converting NSString into uint8_t

I am working on data encryption sample code provided by Apple in the "Certificate, Key and Trust Programming guide". The sample code for encrypting/decrypting data considers an uint8_t. However the real world application would be doing this on an NSString object. I have been trying to convert NSString object to uint8_t but every-time I try I get a compiler warning. Solutions given for 'almost' same problems given in various forums, don't seem to work for me.
Here is an example of turning any string value into a uint8_t*. The easiest way is to just cast the bytes of NSData as and uint8_t*. Other option is to allocate memory and copy the bytes but you will still need to track the length somehow.
NSData *someData = [#"SOME STRING VALUE" dataUsingEncoding:NSUTF8StringEncoding];
const void *bytes = [someData bytes];
int length = [someData length];
//Easy way
uint8_t *crypto_data = (uint8_t*)bytes;
Optional way
//If you plan on using crypto_data as a class variable
// you will need to do a memcpy since the NSData someData
// will get autoreleased
crypto_data = malloc(length);
memcpy(crypto_data, bytes, length);
//work with crypto_data
//free crypto_data most likely in dealloc
free(crypto_data);
NSString *stringToEncrypt = #"SOME STRING VALUE";
uint8_t *cString = (uint8_t *)stringToEncrypt.UTF8String;

Sensitive data: NSString VS NSMutableString (iPhone)

I have some sensitive data I want to clear directly after use. Currently, the sensitive data is in the form of NSString. NSString is in my understanding immutable, meaning that I can't really clear the data. NSMutableString seems more appropriate, though, as it is mutable and has methods like replaceCharactersInRange and deleteCharactersInRange. I have no knowledge of the implementation details so I wonder if NSMutableString would serve my purpose?
I would be afraid NSMutableString would try to optimize and leave the string in memory. If you want more control try allocating your own memory then create an NSString with it. If you do that you can overwrite the memory before you release it.
char* block = malloc(200);
NSString* string = [[NSString alloc] initWithBytesNoCopy:length:encoding:freeWhenDone];
//use string
memset(block, 0, 200);// overwrite block with 0
[string release];
free(block);
You need to wipe the c pointer with zeros with a memset function however a memset call can be optimized out by the compiler, see What is the correct way to clear sensitive data from memory in iOS?
So the code could be something like this:
NSString *string = #"hi";
unsigned char *stringChars = (unsigned char *)CFStringGetCStringPtr((CFStringRef)string, CFStringGetSystemEncoding());
safeMemset(stringChars, 0, [string length]);
But be careful clearing the underlying c pointer of an NSString. On a device for example, if the string contains the word "password", the underlying c pointer just reuses or points to the same address as used by the system and you will crash by trying to wipe this area of memory.
To be safe you may want to use a char array, not the char pointer, to store your sensitive strings and wipe them after without ever putting it into an NSString object.
If an attacker can read the contents of memory, you are beyond hosed.
-release the string and be done with it. There's no way to know if you've deleted any possible copies of the string in various caches (such as if you draw it to screen, etc).
You probably have much more significant security issues to worry about.
As of iOS9, inner pointer of NSString obtained from the snippet below has become read-only and generates bad access when trying to set the bytes.
unsigned char *stringChars = (unsigned char *)CFStringGetCStringPtr((CFStringRef)string, CFStringGetSystemEncoding());
It is possible with NSMutableString but then if you have another NSString source, say from a textfield, that source will still be in memory and you're still out of luck.
If you are creating a new NSString, The best way is implement your own String class with underlying byte array. Provide a method to create NSString copies using the underlying byte array as the inner pointer.:
-(NSString *)string
{
return [[NSString alloc] initWithBytesNoCopy:_buff length:_length encoding:NSUTF8StringEncoding freeWhenDone:NO];
}
// Will prematurely wipe data and all its copies when called
- (void)clear
{
// Volatile keyword disables compiler's optimization
volatile unsigned char *t = (unsigned char *)_buff;
int len = _length;
while (len--) {
*t++ = 0;
}
}
// In case you forget to clear, it will cleared on dealloc
- (void)dealloc
{
[self clear];
free(_buff);
}

Resources