Apple's documentation states the following about the value of a key:
Default value strings may contain extended ASCII characters.
...
Just as in C, some characters must be prefixed with a backslash before you can include them in the string. These characters include double quotation marks, the backslash character itself, and special control characters such as linefeed (\n) and carriage returns (\r).
In my experience it didn't matter whether I used ' or \'; after calling the localizedStringForKey:value:table: method, the result was the same.
question: why?
question: is there an explicit list somewhere which lists all the characters which must be escaped and which can be (meaning that the result will be the same)?
The JSONPointer notation (rfc6901) allows you to denote a location in a JSON document as a string.
I was surprised to see that the specification uses a tilde '~' as the escape character.
Why was this chosen rather than something more conventional like a backslash '\'?
The reason backslash cannot be used is that backslash already has a meaning in JSON and it is desirable to be able to include a JSONPointer in a JSON document without having to double escape it.
If you read the specification carefully you will note:
JSON String Representation
A JSON Pointer can be represented in a JSON string value. Per
[RFC4627], Section 2.5, all instances of quotation mark '"' (%x22),
reverse solidus '\' (%x5C), and control (%x00-1F) characters MUST be
escaped.
Note that before processing a JSON string as a JSON Pointer,
backslash escape sequences must be unescaped.
Another reason is to allow for URI encoding.
According to this discussion it was almost caret '^' instead.
Note also that tilde '~' is allowed in URLs whereas caret '^' is not.
Though see http://jkorpela.fi/tilde.html for a counterpoint to tildes in URLs.
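To make the escaping concrete, here is a minimal Python sketch of the two RFC 6901 escape sequences ('~0' for a tilde, '~1' for a slash). The helper names escape_token, unescape_token and build_pointer are made up for this illustration and are not part of any library. It also shows that a pointer escaped with tildes passes through JSON string encoding unchanged, whereas a backslash has to be doubled.

```python
import json


def escape_token(token: str) -> str:
    # RFC 6901: replace '~' with '~0' first, then '/' with '~1'.
    return token.replace("~", "~0").replace("/", "~1")


def unescape_token(token: str) -> str:
    # Reverse order when decoding: '~1' -> '/' first, then '~0' -> '~'.
    return token.replace("~1", "/").replace("~0", "~")


def build_pointer(tokens) -> str:
    # Hypothetical helper: join escaped reference tokens with '/'.
    return "".join("/" + escape_token(t) for t in tokens)


pointer = build_pointer(["a/b", "m~n"])
print(pointer)              # /a~1b/m~0n
print(json.dumps(pointer))  # "/a~1b/m~0n" (no extra escaping needed)

# A backslash, by contrast, must be doubled inside a JSON string.
print(json.dumps("a\\b"))   # "a\\b"
```

Decoding in the opposite order matters: unescape_token("~01") gives "~1", not "/".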
In Lua (I can only find examples in other languages), how do I remove all punctuation, special characters and whitespace from a string? So, for example, s t!r#i%p^(p,e"d would become stripped?
In Lua patterns, the character class %p represents all punctuation characters, the character class %c represents all control characters, and the character class %s represents all whitespace characters. So you can represent all punctuation characters, all control characters, and all whitespace characters with the set [%p%c%s].
To remove these characters from a string, you can use string.gsub. For a string str, the code would be the following:
str = str:gsub('[%p%c%s]', '')
(Note that this is essentially the same as Egor's code snippet above.)
If you remove all special chars, whitespace, … all that's left is letters and numbers, right? So if str is your string,
str:gsub( "%W", "" )
will return the cleaned string as its first result (gsub also returns the number of substitutions as a second result).
%w matches all alphanumeric characters; the upper-case form %W matches everything else. If that's not exactly what you want to match, you can build your own character class -- e.g. if you wanted to include _ as an acceptable character, you could use [^%w_].
This works for me
m=your_string:gsub('%W','')
I have the following regex handy to match all the lines containing the console.log() or alert() function in any JavaScript file opened in an editor that supports PCRE.
^.*\b(console\.log|alert)\b.*$
But I encounter many files containing window.alert() lines that alert important messages, and I don't want to remove or replace them.
So the question is: how can I regex-match (with a single-line regex that doesn't need to be run repeatedly) all the lines containing console.log() and alert() but not containing the word window? Also, how do I escape round brackets (parentheses), which I could not escape with \, to make them part of the string literal?
I tried the following regex, but in vain:
^.*\b(console\.log|alert)((?!window).)*\b.*$
You should use a negative lookahead, like this:
^(?!.*window\.).*\b(console\.log|alert)\b.*$
The negative lookahead, anchored at the start of the line, makes the match fail whenever the string window. appears anywhere on that line.
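As for the parentheses, you can escape them with backslashes. But because the group is followed by a word boundary \b, adding the escaped parentheses before it will usually stop the pattern from matching: parentheses are not word characters, so there is no word boundary after them unless a word character follows directly.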
The metacharacter \b is an anchor like the caret and the dollar sign.
It matches at a position that is called a "word boundary". This match
is zero-length.
There are three different positions that qualify as word boundaries:
Before the first character in the string, if the first character is a word character.
After the last character in the string, if the last character is a word character.
Between two characters in the string, where one is a word character and the other is not a word character.
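As a quick check of the lookahead and of the word-boundary point, here is a small Python sketch. Python's re module is used only as a stand-in for a PCRE editor (it supports the same lookahead and \b syntax), and the sample lines and the PATTERN_PAREN variant are made up for illustration.

```python
import re

# The pattern from the answer: reject any line containing "window.".
PATTERN = re.compile(r"^(?!.*window\.).*\b(console\.log|alert)\b.*$")

# Assumed variant that also requires the opening parenthesis.
# Note there is no \b after \( because "(" is not a word character.
PATTERN_PAREN = re.compile(r"^(?!.*window\.).*\b(console\.log|alert)\(.*$")

lines = [
    'console.log("debug");',
    '  alert("oops");',
    'window.alert("important");',
    'var total = 1;',
]

matched = [line for line in lines if PATTERN.match(line)]
matched_paren = [line for line in lines if PATTERN_PAREN.match(line)]
print(matched)  # the console.log line and the bare alert line only

# \b right after \( only succeeds when a word character follows "(".
print(re.search(r"\balert\(\b", 'alert("oops")'))  # None
print(re.search(r"\balert\(\b", "alert(x)"))       # a match object
```

The window.alert line is skipped by both patterns, while the lines that call console.log or a bare alert are kept.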
I need to add some French translations to an iOS application, but I don't know how to use the single quote character in the Localizable.strings file.
For example, this text:
"Invalid username or password."="Nom d'utilisateur ou mot de passe incorrect.";
causes an error. I've tried adding backslashes, but that hasn't worked either.
Using Special Characters in String Resources
Just as in C, some characters must be prefixed with a backslash before
you can include them in the string. These characters include double
quotation marks, the backslash character itself, and special control
characters such as linefeed (\n) and carriage returns (\r).
From:
https://developer.apple.com/library/mac/documentation/Cocoa/Conceptual/LoadingResources/Strings/Strings.html
Did you try escaping it with a backslash?
"\'"
As a last resort you could use \U escape codes directly:
You can include arbitrary Unicode characters in a value string by
specifying \U followed immediately by up to four hexadecimal digits.
The four digits denote the entry for the desired Unicode character;
for example, the space character is represented by hexadecimal 20 and
thus would be \U0020 when specified as a Unicode character. This
option is useful if a string must include Unicode characters that for
some reason cannot be typed. If you use this option, you must also
pass the -u option to genstrings in order for the hexadecimal digits
to be interpreted correctly in the resulting strings file. The
genstrings tool assumes your strings are low-ASCII by default and only
interprets backslash sequences if the -u option is specified.
The apostrophe should be \U0027
I have seen the following on StackOverflow about URL characters:
There are two sets of characters you need to watch out for - Reserved and Unsafe.
The reserved characters are:
ampersand ("&")
dollar ("$")
plus sign ("+")
comma (",")
forward slash ("/")
colon (":")
semi-colon (";")
equals ("=")
question mark ("?")
'At' symbol ("@").
The characters generally considered unsafe are:
space,
question mark ("?")
less than and greater than ("<>")
open and close brackets ("[]")
open and close braces ("{}")
pipe ("|")
backslash ("\")
caret ("^")
tilde ("~")
percent ("%")
pound ("#").
I'm trying to code a URL so I can parse it using delimiters. They can't be numbers or letters though. Does anyone have a list of characters that are NOT Reserved but ARE safe to use?
Thanks for any help you can provide.
Don't bother trying to use safe/unreserved characters. Just use whatever delimiters you want and URL-encode the whole thing. Then URL-decode it on the other end and parse normally.
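For illustration, here is a rough Python sketch of that approach using the standard library's urllib.parse.quote and unquote. The pipe delimiter, the sample values, and the example.com URL are all assumptions made up for the example.

```python
from urllib.parse import quote, unquote

DELIMITER = "|"  # assumed delimiter; "|" is one of the "unsafe" characters
fields = ["alpha", "beta", "gamma delta"]

# Join with the delimiter, then percent-encode everything (safe="" means
# no character is exempt apart from letters, digits and a few unreserved marks).
payload = DELIMITER.join(fields)
encoded = quote(payload, safe="")
url = "http://example.com/page?data=" + encoded
print(url)  # http://example.com/page?data=alpha%7Cbeta%7Cgamma%20delta

# On the receiving side: decode first, then split on the delimiter.
decoded_fields = unquote(encoded).split(DELIMITER)
print(decoded_fields)
```

Because the delimiter is percent-encoded along with the data, it no longer matters whether it appears in the reserved or unsafe lists.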
Is there a reason you can't just use the standard delimiter for URL parameters (&)? That is the most straightforward way to do it instead of trying to roll your own.
For example, the standard URL syntax already allows for multi-valued parameters natively. This is perfectly legal and doesn't require any trickery.
Somepage.aspx?parameterName=A&parameterName=B
The result is that the page would be passed "A,B" in the parameterName attribute.
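The "A,B" joining described above is what the ASP.NET page in the example sees. As a hedged comparison, the sketch below parses the same query string with Python's standard urllib.parse.parse_qs, which keeps repeated parameters as a list instead; the example.com host is an assumption added so the URL is complete.

```python
from urllib.parse import parse_qs, urlsplit

# Same query string as above, with an assumed host.
url = "http://example.com/Somepage.aspx?parameterName=A&parameterName=B"

query = urlsplit(url).query
params = parse_qs(query)

# Repeated keys are collected into a list, in order of appearance.
values = params["parameterName"]
print(values)  # ['A', 'B']

# Joining with a comma reproduces the "A,B" form mentioned above.
joined = ",".join(values)
print(joined)  # A,B
```

Either way, no custom delimiter is needed: the repeated parameter name carries the multiple values.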