I've just come across this line in some legacy code I'm editing:
[UIImage imageNamed:@"data/visuals/interface/" @"backgroundViewController"];
^^^^
"Oops, what have I done here?"
I thought I must have accidentally just pasted something in the wrong place, but an undo didn't change that line. Out of curiosity, I built the program and it was successful!
Whaddyaknow? Obj-c has a more succinct way of concatenating string literals.
I added some more tests:
A simple log
NSLog(@"data/visuals/interface/" @"backgroundViewController");
data/visuals/interface/backgroundViewController
In parameters
NSURL *url = [NSURL URLWithString:@"http://" @"test.com" @"/path"];
NSLog(@"URL:%@", url);
URL:http://test.com/path
Using Variables
NSString *s = @"string1";
NSString *s2 = @"string2";
NSLog(@"%@", s s2);
Doesn't compile (not surprised by this one)
Other literals
NSNumber *number = @1 @2;
Doesn't compile
Some questions
Is this string concatenation documented anywhere?
How long has it been supported?
What is the underlying implementation? I expect it will be [s1 stringByAppendingString:s2]
Is it considered good practice by any authoritative body?
This method of concatenating static NSStrings is a compile-time compiler capability that has been available for over ten years. It is usually used to allow long constant strings to be split over several lines. Similar capabilities have been available in "C" for decades.
The C Programming Language (second edition, 1988) describes string concatenation on page 38, so it has been around for a long time.
Excerpt from the book:
String constants can be concatenated at compile time:
"hello," " world" is equivalent to "hello, world"
This is useful for splitting long strings across several source lines.
Objective-C is a strict superset of "C" so it has always supported "C" string concatenation and my guess is that because of that static NSString concatenation has always been available.
It is considered good practice when used to split a static string across several lines for readability.
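For what it's worth, here is a minimal sketch of the same thing in plain C (which also compiles as Objective-C). The names kBackgroundPath, background_path and check_background_path are made up for illustration. The only point is that the adjacent literals end up as one constant string, with no append call at run time.

```c
#include <assert.h>
#include <string.h>

/* Adjacent string literals are joined by the compiler before the program
   runs, so this initializer is one constant string split over two lines. */
static const char kBackgroundPath[] =
    "data/visuals/interface/"
    "backgroundViewController";

/* Hypothetical accessor, only here so the joined result can be inspected. */
const char *background_path(void) {
    return kBackgroundPath;
}

/* Sanity check: aborts if the joined literal differs from the
   same text written as a single literal. */
void check_background_path(void) {
    assert(strcmp(kBackgroundPath,
                  "data/visuals/interface/backgroundViewController") == 0);
}
```

As far as I know the same splitting works with @"..." pieces in Objective-C, which is what the question stumbled on.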
Related
While experimenting with the zig syntax, I noticed the type expression of string literals is omitted in all examples. Which is totally fine, I'm not saying it shouldn't be.
const zig_string = "I am a string"; //it looks nice enough for sure and compiles fine of course
However, because this type omission is a bit inconsistent* with other type declarations in zig, it can lead beginners (like me) to misinterpret the actual type of string literals (which is in fact quite rightfully complicated and 'different'). Anyway, I read that string literals are 'pointers to (utf-8 encoded) immutable (const), sentinel-terminated arrays of u8 bytes' (yes?), whose type carries, next to the hard-coded length field, a terminator field, like so: [<length>:0]. To check my own understanding, I thought it reasonable to try adding this type expression to the declaration, similar to how other arrays are conveniently declared, so with an underscore to infer the length, because who likes counting characters?
const string: *const [_:0]u8 = "jolly good"; //doesn't compile: unable to infer array size
But it didn't compile :(.
After dutifully counting characters and now specifying the length of my string however, it proudly compiled :)!
const string: *const [10:0]u8 = "jolly good"; //happily compiles
Which led me to my question:
Why is this length specification needed for string literals and not for other literals/arrays? - (And should this be so?)
Please correct my type description of string literals if I missed an important nuance.
I'd like to know to further deepen my understanding of the way strings are handled in zig.
*although there are more cases where the zig compiler can infer the type without it
Types never have _ in them.
"jolly good" is a string literal. *const [10:0]u8 is the type.
For "other literals/arrays":
const a = [_]u8{ 1, 2, 3 };
[_]u8{ 1, 2, 3 } is an array literal. The type is [3]u8 and it cannot be specified as [_]u8.
Look into slices. They offer a very convenient way to use strings and arrays.
I want to create a string and, based on a condition, append some extra text to it. Which approach is preferred: assigning one of two different strings based on the condition, or creating a mutable string and appending to it when the condition holds?
Example:
NSString *string;
if (a == 1)
{
string = @"apple seed";
}
else
{
string = @"apple";
}
Or
NSMutableString *string = [NSMutableString stringWithString:@"apple"];
if (a == 1)
{
[string appendString:@" seed"];
}
A string literal, like your @"apple", is a compile-time constant so assigning a string literal to a variable of type NSString * is a cheap operation.
So for your particular examples the first selects one of two simple assignments, while the second can do a simple assignment and a method call - which is clearly going to take a little more time.
That said a "little more" on a modern computer is not long. Beware of optimising prematurely; it is far better to write code that is clear and understandable to you first and concern yourself over performance of minutiae later if needed (this is not an excuse to write poor algorithms or intentionally bad code of course).
HTH
I have been having a lot of trouble with NSString's stringWithFormat: method as of late. I have written an object that allows you to align N lines of text (separated by new lines) either centered, right, or left. At the core of my logic is NSString's stringWithFormat. I use this function to pad my strings with spaces on the left or right of individual lines to produce the alignment I want. Here is an example:
NSString *str = @"$3.00" --> 3 dollars
[NSString stringWithFormat:@"%8s", [str cStringUsingEncoding:NSUnicodeStringEncoding]] --> returns --> "   $3.00"
As you can see the above example works great, I padded 3 spaces on the left and the resulting text is right aligned/justified. Problems begin to arise when I start to pass in foreign currency symbols, the formatting just straight up does not work. It either adds extra symbols or just returns garbage.
NSString *str = @"Kč1.00" --> 1 Czech koruna (the Czech Republic's currency)
[NSString stringWithFormat:@"%8s", [str cStringUsingEncoding:NSUnicodeStringEncoding]] --> returns --> " Kč1.00"
The above is just flat out wrong... Now I am not a string encoding expert but I do know NSString uses the international standardized unicode encoding for special characters well outside basic ASCII domain.
How can I fix my problem? What encoding should I use? I have tried so many different encoding enums that I have lost count, everything from NSMacOSRomanStringEncoding to NSUTF32BigEndianStringEncoding. My last resort will be to ditch stringWithFormat: altogether; maybe it was only meant for simple UTF-8 strings and basic symbols.
If you want to represent currency, it is a lot better to use an NSNumberFormatter with the currency style (NSNumberFormatterCurrencyStyle). It reads the current locale and formats the currency accordingly. You just need to ask it for the string representation and append that to your string.
It will be a lot easier than managing unicode formats, check a tutorial here
This will give you the required result
NSString *str = @"Kč1.00";
str = [NSString stringWithFormat:@"%@%8@", [@" " stringByPaddingToLength:3 withString:@" " startingAtIndex:0], str];
Output: @" Kč1.00";
Just one more trick to achieve this -
If you like use it :)
[NSString stringWithFormat:@"%8s%@", [@"" cStringUsingEncoding:NSUTF8StringEncoding], str];
This will work too.
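A possible explanation for the misalignment, at least in plain C: the width in a printf-style %8s counts bytes, and "č" takes two bytes in UTF-8, so the padding comes out short. I can't say for certain that this is what NSString does internally with %s. Below is only a sketch in C (usable from Objective-C) that pads by counting code points instead. The helpers utf8_length and pad_left_utf8 are hypothetical names, the input is assumed to be valid UTF-8, and combining marks or wide glyphs would still be miscounted.

```c
#include <assert.h>
#include <string.h>

/* Count UTF-8 code points: every byte that is not a continuation
   byte (binary 10xxxxxx) starts a new code point. */
size_t utf8_length(const char *s) {
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

/* Right-align s in a field of `width` code points by writing leading
   spaces and then s into out. The caller supplies a buffer of cap bytes. */
char *pad_left_utf8(char *out, size_t cap, const char *s, size_t width) {
    size_t len = utf8_length(s);
    size_t pad = width > len ? width - len : 0;
    /* Room is needed for the padding, the bytes of s, and the NUL. */
    assert(out != NULL && pad + strlen(s) < cap);
    memset(out, ' ', pad);
    strcpy(out + pad, s);
    return out;
}
```

With a width of 8, "$3.00" gets three leading spaces and "Kč1.00" (six code points, seven bytes) gets two, so both occupy eight columns in a monospaced font.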
I have an NSString like this:
test = @"this is %25test%25 string";
I am trying to replace test with some Arabic text, but the result is not what I expect:
[test stringByReplacingOccurrencesOfString:@"test" withString:@"اختبار"];
and the result is:
this is %25 اختبار %25 string
Somewhere I read that there could be a problem with encoding or text alignment. Is some extra adjustment needed for Arabic string operations?
EDIT: I have also tried NSMutableString's insert method, but I still get the same result.
EDIT 2:
One other thing that occurs to me that is causing most of your trouble in this specific example. You have a partially percent-encoded string above. You have spaces, but you also have %25. You should avoid doing that. Either percent-encode a string or don't. Convert it all at once when required (using stringByAddingPercentEscapesUsingEncoding:). Don't try to "hard-code" percent-encoding. If you just used "this is a %اختبار% string" (and then percent-encoded the entire thing at the end), all your directional problems would go away (see how that renders just fine?). The rest of these answers address the more general question when you really need to deal with directionality.
EDIT:
The original answer after the line relates to human-readable strings, and is correct for human-readable strings, but your actual question (based on your followups) is about URLs. URLs are not human-readable strings, even if they occasionally look like them. They are a sequence of bytes that are independent of how they are rendered to humans. "اختبار" cannot be in the path or fragment parts of an URL. These characters are not part of the legal set of characters for those sections (اختبار is allowed to be part of the host, but you have to follow the IDN rules for that).
The correct URL encoding for "this is a %25<arabic>%25 string" is:
this%20is%20a%20%2525%D8%A7%D8%AE%D8%AA%D8%A8%D8%A7%D8%B1%2525%20string
If you decode and render this string to the screen, it will appear like this:
this is a %25اختبار%25 string
But it is in fact exactly the string you mean (and it is the string you should pass to the browser). Follow the bytes (like the computer will):
this - this (ALPHA)
%20 - <space> (encoded)
is - is (ALPHA)
%20 - <space> (encoded)
a - a (ALPHA)
%20 - <space> (encoded)
%25 - % (encoded)
25 - 25 (DIGIT)
%D8%A7 - ا (encoded)
%D8%AE - خ (encoded)
%D8%AA - ت (encoded)
%D8%A8 - ب (encoded)
%D8%A7 - ا (encoded)
%D8%B1 - ر (encoded)
%25 - % (encoded)
25 - 25 (DIGIT)
%20 - <space> (encoded)
string - string (ALPHA)
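If you want to follow the bytes in code rather than by eye, here is a small sketch in plain C that undoes exactly one level of percent-encoding. The names hex_value and percent_decode_once are made up for this illustration and are not a Foundation API. It only shows that "%2525" decodes to the literal text "%25" and that the Arabic bytes come out in the order listed above.

```c
#include <assert.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode one level of percent-encoding from src into dst.
   Malformed escapes (a '%' not followed by two hex digits) are copied
   through unchanged. dst must have room for strlen(src) + 1 bytes. */
char *percent_decode_once(char *dst, size_t cap, const char *src) {
    char *out = dst;
    assert(dst != NULL && src != NULL && strlen(src) < cap);
    while (*src != '\0') {
        /* Only look at src[2] once src[1] is known to be a hex digit,
           so the terminating NUL is never stepped over. */
        int hi = (src[0] == '%') ? hex_value(src[1]) : -1;
        int lo = (hi >= 0) ? hex_value(src[2]) : -1;
        if (lo >= 0) {
            *out++ = (char)(hi * 16 + lo);
            src += 3;
        } else {
            *out++ = *src++;
        }
    }
    *out = '\0';
    return dst;
}
```

Running it once over the encoded string above gives the bytes of "this is a %25", then the six Arabic letters in logical order, then "%25 string". Only the on-screen rendering reorders them.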
The Unicode BIDI display algorithm is doing what it means to do; it just isn't what you expect. But those are the bytes and they're in the correct order. If you add any additional bytes (such as LRO) to this string, then you are modifying the URL and it means something different.
So the question you need to answer is, are you making an URL, or are you making a human-readable string? If you're making an URL, it should be URL-encoded, in which case you will not have this display problem (unless this is part of the host, which is a different set of rules, but I don't believe that's your problem). If this is a human-readable string, see below about how to provide hints and overrides to the BIDI algorithm.
It's possible that you really need both (a human-friendly string, and a correct URL that can be pasted). That's fine, you just need to handle the clipboard yourself. Show the string, but when the user goes to copy it, replace it with the fully encoded URL using UIPasteboard or by overriding copy:. See Copy, Cut, and Paste Operations. This is fairly common (note how Safari displays just "stackoverflow.com" in the address bar, but if you copy and paste it, it pastes "https://stackoverflow.com/"). Same thing.
Original answer related to human-readable strings.
Believe it or not, stringByReplacingOccurrencesOfString:withString: is doing the right thing. It's just not displaying the way you expect. If you walk through characterAtIndex:, you'll find that it's:
% 2 5 ا ...
The problem is that the layout engine gets very confused around all the "neutral direction" characters. The engine doesn't understand whether you meant "%25" to be attached to the left to right part or right to left part. You have to help it out here by giving it some explicit directional characters to work with.
There are a few ways to go about this. First, you can do it the Unicode 6.3 tr9-29 way with Explicit Directional Isolates. This is exactly the kind of problem that Isolates are meant to solve. You have some piece of text whose direction you want to be considered completely independently of all other text. Unicode 6.3 isn't actually supported by iOS or OS X as best I can tell, but for many (though not all) uses, it "works."
You want to surround your Arabic with FSI (FIRST STRONG ISOLATE U+2068) and PDI (POP DIRECTIONAL ISOLATE U+2069). You could also use RLI (RIGHT-TO-LEFT ISOLATE) to be explicit. FSI means "treat this text as being in the direction of the first strong character you find."
So you could ideally do this:
NSString *test = @"this is a %25\u2068test\u2069%25 string";
NSString *arabic = @"اختبار";
NSString *result = [test stringByReplacingOccurrencesOfString:@"test" withString:arabic];
That works if you know what you're going to substitute before hand (so you know where to put the FSI and PDI). If you don't, you can do it the other way and make it part of the substitution:
NSString * const FSI = @"\u2068";
NSString * const PDI = @"\u2069";
NSString *test = @"this is %25test%25 string";
NSString *arabic = @"اختبار";
NSString *replaceString = [@[FSI, arabic, PDI] componentsJoinedByString:@""];
NSString *result = [test stringByReplacingOccurrencesOfString:@"test" withString:replaceString];
I said this "mostly" works. It's fine for UILabel, and it probably is fine for anything using Core Text. But in NSLog output, you'll get these extra "placeholder" characters:
You might get this other places, too. I haven't checked UIWebView for instance.
So there are some other options. You can use directional marks. It's a little awkward, though. LRM and RLM are zero-width strongly directional characters. So you can bracket the arabic with LRM (left to right mark) so that the arabic doesn't disturb the surrounding text. This is a little ugly since it means the substitution has to be aware of what it's substituting into (which is why isolates were invented).
NSString * const LRM = @"\u200e";
NSString *test = @"this is a %25test%25 string";
NSString *replaceString = [@[LRM, arabic, LRM] componentsJoinedByString:@""];
NSString *result = [test stringByReplacingOccurrencesOfString:@"test" withString:replaceString];
BTW, Directional Marks are usually the right answer. They should always be the first thing you try. This particular problem is just a little too tricky.
One more way is to use Explicit Directional Overrides. These are the giant "do what I tell you to do" hammer of the Unicode world. You should avoid them whenever possible. There are some security concerns with them that make them forbidden in certain places (<RLO>elgoog<PDF>.com would display as google.com for instance). But they will work here.
You bracket the whole string with LRO/PDF to force it to be left-to-right. You then bracket the substitution with RLO/PDF to force it to the right-to-left. Again, this is a last resort, but it lets you take complete control over the layout:
NSString * const LRO = @"\u202d";
NSString * const RLO = @"\u202e";
NSString * const PDF = @"\u202c";
NSString *test = [@[LRO, @"this is a %25test%25 string", PDF] componentsJoinedByString:@""];
NSString *arabic = @"اختبار";
NSString *replaceString = [@[RLO, arabic, PDF] componentsJoinedByString:@""];
NSString *result = [test stringByReplacingOccurrencesOfString:@"test" withString:replaceString];
I would think you could solve this problem with the Explicit Directional Embedding characters, but I haven't really found a way to do it without at least one override (for instance, you could use RLE instead of RLO above, but you still need the LRO).
Those should give you the tools you need to figure all of this out. See the Unicode TR9 for the gory details. And if you want a deeper introduction to the problem and solutions, see Cal Henderson's excellent Understanding Bidirectional (BIDI) Text in Unicode.
You should try like this:
NSString *test = @"this is %25test%25 string";
NSString *test2 = [[[test stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding] componentsSeparatedByString:@"test"] componentsJoinedByString:@"اختبار"];
I have an issue in an application I'm writing where I need to compare one NSURL that points to a file and an NSString, which is an incoming string representation of the same file path.
I can't get them to compare as equal. The output I get from NSLog is confusing; perhaps it is an encoding issue?
I can make them look the same with this code: [urlString stringByRemovingPercentEncoding];
The raw output for the NSURL is:
file:///var/mobile/Applications/F14AFBD8-FF60-4094-8BBD-7AC2477E0B20/Documents/1.%20AKTIV%20SA%CC%88LJFOLDER/Sa%CC%88ljfolder2014-SP1.pdf
And for the NSString:
/var/mobile/Applications/F14AFBD8-FF60-4094-8BBD-7AC2477E0B20/Documents/1. AKTIV SÄLJFOLDER/Säljfolder2014-SP1.pdf
If I run stringByRemovingPercentEncoding on the NSURL it looks the same, but they don't compare.
If I run stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding to the NSString I get file:///var/mobile/Applications/F14AFBD8-FF60-4094-8BBD-7AC2477E0B20/Documents/nestle/1.%20AKTIV%20S%C3%84LJFOLDER/S%C3%A4ljfolder2014-SP1.pdf
Note that the percent-escapes are not the same in the two URLs. I have tried so many things, changing encodings etc., but can't find a way to solve this.
Edit
So, I tried the precomposedStringWithCanonicalMapping as follows:
NSLog(@"EQUAL? :%hhd", [[strippedUrlString precomposedStringWithCanonicalMapping] isEqualToString:[filePath precomposedStringWithCanonicalMapping]]); – returns 0
I logged the strings and got
/Users/xxxxxx/Library/Application Support/iPhone Simulator/7.0/Applications/C05E0885-7B58-4B2F-A6B4-D9388E60462C/Documents/1. AKTIV SÄLJFOLDER/Säljfolder2014-SP1.pdf
with NSLog(@"Precompose url 1: %@", [strippedUrlString precomposedStringWithCanonicalMapping]);
for the first string and
/Users/xxxxxx/Library/Application%20Support/iPhone%20Simulator/7.0/Applications/C05E0885-7B58-4B2F-A6B4-D9388E60462C/Documents/1.%20AKTIV%20SA%CC%88LJFOLDER/Sa%CC%88ljfolder2014-SP1.pdf
with NSLog(@"Precompose file 1: %@", [filePath precomposedStringWithCanonicalMapping]);
for the second.
Tried same code, but with precomposedStringWithCompatibilityMapping and got exactly the same result :(
You have probably run into the problem that equivalent Unicode strings are not always binary-equal.
http://en.wikipedia.org/wiki/Unicode_equivalence
You have
…SA%CC%88…:
This is the problem.
It means: We have an "A" and a combining diaeresis -> Ä. The diaeresis is 0xCC88, which is UTF-8 for Unicode 0x0308 (COMBINING DIAERESIS). So the Ä is encoded as an A with a combining diaeresis.
…S%C3%84…:
This is easy. 0xC384 is UTF-8 for 0x00C4 that means A-Umlaut -> Ä
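To double-check those two byte sequences, here is a minimal UTF-8 encoder sketch in plain C. The function name utf8_encode is made up for illustration and this is not how Foundation does it internally; it just applies the standard UTF-8 bit layout to one code point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode scalar value as UTF-8.
   Writes 1 to 4 bytes into out and returns how many were written. */
size_t utf8_encode(uint32_t cp, unsigned char out[4]) {
    /* Surrogates and values above U+10FFFF are not valid scalar values. */
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

Encoding U+0308 yields the bytes CC 88 and encoding U+00C4 yields C3 84, which matches the two percent-escaped forms: "A" followed by %CC%88 versus %C3%84 alone. They are the same letter to a reader but different bytes.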
First of all: What is the source of the first string?
Addition: You can use precomposedStringWith…Mapping (NSString).
BTW: You can compare strings while ignoring diacritic marks using -compare:options: et al. with the option NSDiacriticInsensitiveSearch. In this case, I assume, string 1 equals string 2. But it would equal a plain "A", too, which is probably not what you want.