Replace double backslash in Dart - dart

I have this escaped string:
\u0414\u043B\u044F \u043F\u0440\u043E\u0434\u0430\u0436\u0438
\u043D\u0435\u0434\u0432\u0438\u0436\u0438\u043C\u043E\u0441\u0442\u0438
If I do:
print('\u0414\u043B\u044F \u043F\u0440\u043E\u0434\u0430\u0436\u0438 \u043D\u0435\u0434\u0432\u0438\u0436\u0438\u043C\u043E\u0441\u0442\u0438');
Console will show me:
Для продажи недвижимости
But if I get a doubly escaped string from the server:
\\u0414\\u043B\\u044F
\\u043F\\u0440\\u043E\\u0434\\u0430\\u0436\\u0438
\\u043D\\u0435\\u0434\\u0432\\u0438\\u0436\\u0438\\u043C\\u043E\\u0441\\u0442\\u0438
And try to replace the double backslashes:
var result = string.replaceAll(new RegExp(r'\\'), r'\');
The escape sequences are not decoded, and the same escaped string is shown:
print(result);
Console:
\u0414\u043B\u044F \u043F\u0440\u043E\u0434\u0430\u0436\u0438
\u043D\u0435\u0434\u0432\u0438\u0436\u0438\u043C\u043E\u0441\u0442\u0438
How can I remove those redundant backslashes?

In string literals in Dart source files, \u0414 is an escape sequence that represents a Unicode code point. The data returned from the server, by contrast, is just a string containing backslashes, the letter u, and hex digits; it only looks like a series of Unicode escape sequences.
The ideal fix is to have your server return the UTF-8 string you'd like to display rather than a string that uses Dart's string literal syntax that you need to parse. Writing a proper parser for such strings is fairly involved. You can take a look at unescapeCodeUnits in the Dart SDK for an example.
A very inefficient (not to mention entirely hacky and unsafe for real-world use) way of decoding this particular string is to extract the hex representations of the code points with a RegExp, parse each one to an int, and then use String.fromCharCode().
Note: the following code is absolutely not safe for production use and doesn't match other valid Dart code point literals such as \u{1f601}, or reject entirely invalid literals such as \uffffffffff.
// Match \u0123 substrings (note this will match invalid codepoints such as \u123456789).
final RegExp r = RegExp(r'\\\\u([0-9a-fA-F]+)');
// Sample string to parse.
final String source = r'\\u0414\\u043B\\u044F \\u043F\\u0440\\u043E\\u0434\\u0430\\u0436\\u0438 \\u043D\\u0435\\u0434\\u0432\\u0438\\u0436\\u0438\\u043C\\u043E\\u0441\\u0442\\u0438';
// Replace each \u0123 with the decoded codepoint.
final String decoded = source.replaceAllMapped(r, (Match m) {
  // Extract the parenthesised hex string. '\\u0123' -> '0123'.
  // group(1) is nullable under null safety; it is always set when the pattern matches.
  final String hexString = m.group(1)!;
  // Parse the hex string to an int.
  final int codepoint = int.parse(hexString, radix: 16);
  // Convert codepoint to string.
  return String.fromCharCode(codepoint);
});
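If the escapes coming from the server are plain JSON-style \uXXXX sequences, one alternative to a hand-written RegExp is to let jsonDecode do the unescaping. The following is only a sketch under that assumption: unescapeOnce is a hypothetical helper name, and it will throw a FormatException if the input contains an unescaped double quote or a raw control character.

```dart
import 'dart:convert';

/// Hypothetical helper: treats [s] as the body of a JSON string literal and
/// lets jsonDecode resolve one level of escaping (\\ -> \, \uXXXX -> char).
String unescapeOnce(String s) => jsonDecode('"$s"') as String;

void main() {
  // Raw literal: the string really contains two backslashes before each 'u'.
  const doubled = r'\\u0414\\u043B\\u044F';

  // First pass collapses each pair of backslashes into a single backslash.
  final once = unescapeOnce(doubled);

  // Second pass resolves the \uXXXX escapes into the actual characters.
  final twice = unescapeOnce(once);

  print(once);
  print(twice);
}
```

Each call removes exactly one level of escaping, so a doubly escaped string needs two passes. Fixing the server so that it escapes only once is still the better solution.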

Related

codeUnits property vs utf8.encode function in Dart

I have this little code:
import 'dart:convert';

void main(List<String> args) {
  const data = 'amigo+/=:chesu';
  var encoded = base64Encode(utf8.encode(data));
  var encoded2 = base64Encode(data.codeUnits);
  var decoded = utf8.decode(base64Decode(encoded));
  var decoded2 = utf8.decode(base64Decode(encoded2));
  print(encoded);
  print(encoded2);
  print(decoded);
  print(decoded2);
}
The output is:
YW1pZ28rLz06Y2hlc3U=
YW1pZ28rLz06Y2hlc3U=
amigo+/=:chesu
amigo+/=:chesu
codeUnits property gives an unmodifiable list of the UTF-16 code units, is it OK to use utf8.decode function? or what function should be used for encoded2?
It's simply not a good idea to do base64Encode(data.codeUnits) because base64Encode encodes bytes, and data.codeUnits isn't necessarily bytes.
Here they happen to be bytes, because all the characters of the string have code points below 256 (in fact, they are all ASCII).
Using utf8.encode before base64Encode is good. It works for all strings.
The best way to convert from UTF-16 code units to a String is String.fromCharCodes.
Here you are using base64Encode(data.codeUnits) which only works if the data string contains only code units up to 255. So, if you assume that, then it means that decoding that can be done using either latin1.decode or String.fromCharCodes.
Using ascii.decode and utf8.decode also works if the string only contains ASCII (which it does here, but which isn't guaranteed by base64Encode succeeding).
In short, don't do base64Encode(data.codeUnits). Convert the string to bytes before doing base64Encode, then use the reverse conversion to convert bytes back to strings.
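As a rough illustration of the recommended round trip, here is a minimal sketch using a string whose code units do not fit in a byte. The helper names toBase64 and fromBase64 are made up for this example.

```dart
import 'dart:convert';

/// Hypothetical helper: string -> UTF-8 bytes -> Base64 text.
String toBase64(String text) => base64Encode(utf8.encode(text));

/// Hypothetical helper: the reverse conversion, Base64 text -> bytes -> string.
String fromBase64(String encoded) => utf8.decode(base64Decode(encoded));

void main() {
  const data = 'Для';

  // The UTF-16 code units of this string are all above 255, so they are
  // not valid bytes and must not be passed to base64Encode directly.
  print(data.codeUnits);

  // Encoding to UTF-8 first works for any string.
  final encoded = toBase64(data);
  print(encoded);
  print(fromBase64(encoded));
}
```

The point of pairing the two helpers is that the decoder always applies the exact inverse of what the encoder did, regardless of which characters the string contains.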
I tried this
print(utf8.decode('use âsmartâ symbols like â thisâ'.codeUnits));
and got this
use “smart” symbols like ‘ this’
The ” and ‘ are smart characters from iOS keyboard

How to convert string to raw string in dart

I want to convert an existing string to raw string.
like:
String s = "Hello \n World";
I want to convert this s variable to a raw string (I want to print exactly "Hello \n World").
I need the backslash (\) in the output. I am fetching the string value from a REST API, and it contains a bunch of MathJax (LaTeX) formulas with backslashes.
Thanks
You are asking for a way to escape newlines (and possibly other control characters) in a string value.
There is no general way to do that for Dart strings in the platform libraries, but in most cases, using jsonEncode is an adequate substitute.
So, given your string containing a newline, you can convert it to a string containing \n (a backslash and an n) as var escapedString = jsonEncode(string);. The result is also wrapped in double-quotes because it really is a JSON string literal. If you don't want that, you can drop the first and last character: escapedString = escapedString.substring(1, escapedString.length - 1);.
Alternatively, if you only care about newlines, you can just replace them yourself:
var myString = string.replaceAll("\n", r"\n");
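Putting the jsonEncode approach together, a minimal sketch could look like the following. The function name escapeForDisplay is an assumption made for this example, not a platform API.

```dart
import 'dart:convert';

/// Hypothetical helper: returns [value] with control characters such as
/// newlines rendered as JSON escape sequences, without the enclosing quotes.
String escapeForDisplay(String value) {
  // jsonEncode produces a JSON string literal, including the double quotes.
  final quoted = jsonEncode(value);
  // Drop the first and last character to remove those quotes.
  return quoted.substring(1, quoted.length - 1);
}

void main() {
  const s = 'Hello \n World';
  // The newline is shown as a backslash followed by the letter n.
  print(escapeForDisplay(s));
}
```

Keep in mind that jsonEncode also escapes backslashes and double quotes that are already in the string, so a LaTeX formula containing a backslash will come out with that backslash doubled.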

Swift: getting character from string (ascii value)

I'm trying to get the character value in ascii and also the character at index.
I have this Objective-C code. Would any of you know how to convert it to Swift?
po [strToSort characterAtIndex:i] // character x
U+0078 u'x'
po [strToSort UTF8String][i]
x
I'll really appreciate your help.
Updated: you can directly subscript string with an Index
Swift doesn't allow you to subscript Strings with an Integer index. Instead you can construct an index to pass in.
let str = "String with some characters"
let index = str.index(str.startIndex, offsetBy: 5)
let character = str[index]
print(character) // "g"
For more information on why you can't treat strings as a direct sequence of characters, you can find more info here.
Essentially to be properly unicode compliant, sometimes multiple characters can be combined to create a single character in the final string. This causes issues with naive counting and indexing.
If you want a utf8 representation of the string, String provides a utf8 property as well as a unicodeScalars property for getting the code point for each character.

Swift - remove single backslash

This is maybe a stupid question, but I'm new to Swift and I can't figure this out.
I have an API which returns a URL as the string "http:\/\/xxx". I don't know how to store a URL returned from the API in this format. I can't store it in a variable because of the backslashes.
From apple doc:
...string cannot contain an unescaped backslash (\), ...
Is there any way to store a string like this, to remove these single backslashes, or to work with it some other way?
Thank you for every advice.
You can just replace those backslashes, for example:
let string2 = string1.stringByReplacingOccurrencesOfString("\\", withString: "")
Or, to avoid the confusion over the fact that the backslash within a normal string literal is escaped with yet another backslash, we can use the extended string delimiters #" and "# (available in Swift 5 and later, where the method is spelled replacingOccurrences(of:with:)):
let string2 = string1.replacingOccurrences(of: #"\"#, with: "")
But, if possible, you really should fix that API that is returning those backslashes, as that's obviously incorrect. The author of that code was apparently under the mistaken impression that forward slashes must be escaped, but this is not true.
Bottom line, the API should be fixed to not insert these backslashes, but until that's remedied, you can use the above to remove any backslashes that may occur.
In the discussion in the comments below, there seems to be enormous confusion about backslashes in strings. So, let's step back for a second and discuss "string literals". As the documentation says, a string literal is:
You can include predefined String values within your code as string literals. A string literal is a fixed sequence of textual characters surrounded by a pair of double quotes ("").
Note, a string literal is just a representation of a particular fixed sequence of characters in your code. But, this should not be confused with the underlying String object itself. The key difference between a string literal and the underlying String object is that a string literal allows one to use a backslash as an "escape" character, used when representing special characters (or doing string interpolation). As the documentation says:
String literals can include the following special characters:
The escaped special characters \0 (null character), \\ (backslash), \t (horizontal tab), \n (line feed), \r (carriage return), \" (double quote) and \' (single quote)
An arbitrary Unicode scalar, written as \u{n}, where n is a 1–8 digit hexadecimal number with a value equal to a valid Unicode code point
So, you are correct that in a string literal, as the excerpt you quoted above points out, you cannot have an unescaped backslash. Thus, whenever you want to represent a single backslash in a string literal, you represent that with a \\.
Thus the above stringByReplacingOccurrencesOfString means "look through the string1, find all occurrences of a single backslash, and replace them with an empty string (i.e. remove the backslash)."
Consider:
let string1 = "foo\\bar"
print(string1) // this will print "foo\bar"
print(string1.characters.count) // this will print "7", not "8"
let string2 = string1.stringByReplacingOccurrencesOfString("\\", withString: "")
print(string2) // this will print "foobar"
print(string2.characters.count) // this will print "6"
A little confusingly, if you look at string1 in the "Variables" view of the "Debug" panel or within a playground, it will show a string literal representation (i.e. backslashes will appear as "\\"). But don't be confused: when you see \\ in the string literal, there is only a single backslash in the actual string, as you can confirm by printing the value or looking at the actual characters.
In short, do not conflate the escaping of the backslash within a string literal (for example, the parameters to stringByReplacingOccurrencesOfString) and the single backslash that exists in the underlying string.
I was having this same issue when trying to encode my objects to JSON. If you're using the newer JSONEncoder class to encode your JSON and your minimum deployment target is iOS 13, you can use the .withoutEscapingSlashes output formatting option:
let encoder = JSONEncoder()
encoder.outputFormatting = .withoutEscapingSlashes
try encoder.encode(yourJSONObject)
Please check the below code.
let jsonStr = "[{\"isSelected\":true,\"languageProficiencies\":[{\"isSelected\":true,\"name\":\"Advance\"},{\"isSelected\":false,\"name\":\"Proficient\"},{\"isSelected\":false,\"name\":\"Basic\"},{\"isSelected\":false,\"name\":\"Below Basic\"}],\"name\":\"English\"},{\"isSelected\":false,\"languageProficiencies\":[{\"isSelected\":false,\"name\":\"Advance\"},{\"isSelected\":false,\"name\":\"Proficient\"},{\"isSelected\":false,\"name\":\"Basic\"},{\"isSelected\":false,\"name\":\"Below Basic\"}],\"name\":\"Malay\"},{\"isSelected\":false,\"languageProficiencies\":[{\"isSelected\":false,\"name\":\"Advance\"},{\"isSelected\":false,\"name\":\"Proficient\"},{\"isSelected\":false,\"name\":\"Basic\"},{\"isSelected\":false,\"name\":\"Below Basic\"}],\"name\":\"Chinese\"},{\"isSelected\":false,\"languageProficiencies\":[{\"isSelected\":false,\"name\":\"Advance\"},{\"isSelected\":false,\"name\":\"Proficient\"},{\"isSelected\":false,\"name\":\"Basic\"},{\"isSelected\":false,\"name\":\"Below Basic\"}],\"name\":\"Tamil\"}]"
let convertedStr = jsonStr.replacingOccurrences(of: "\\", with: "", options: .literal, range: nil)
print(convertedStr)
I've solved with this piece of code:
let convertedStr = jsonString.replacingOccurrences(of: "\\/", with: "/")
To remove a single backslash, try this:
let replaceStr = backslashString.replacingOccurrences(of: "\\", with: "")
Include a backslash in a string by adding an extra backslash.

Convert a string to raw string

It's easy to declare a raw string:
String raw = #'\"Hello\"';
But how can I convert an existing string to a raw string?
This can come up with file reading or an ajax call: I want to read the file as a raw string, but the readAsText method gives me a non-raw string.
I tried things like:
String notRaw = '\"Hello\"';
String raw = #raw;
But it does not compile.
How can I do this?
EDIT: I need to read the string char by char. So I don't want to read \" as the single char " but as the two chars \ and "
If you want to read a file without interpreting escape characters, then you need to use readAsBytes, which will give you the file's bytes as a list of integers. You can then detect a backslash and a quote as follows (in current Dart, raw literals use the r prefix and the method is called codeUnitAt):
final int backSlash = r'\'.codeUnitAt(0);
final int quote = r'"'.codeUnitAt(0);
You then pass the desired substrings to a string constructor:
String goodString = new String.fromCharCodes(byteList.getRange(start, end));
A raw string literal is still a string. The only difference is that everything within a raw literal is accepted as-is, i.e. escape sequences are not interpreted.
For example, in a plain (non-raw) string literal 'a\nb' represents letter 'a', newline, letter 'b'; in a raw string literal it represents letter 'a', backslash, letter 'n', letter 'b'.
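The difference can be checked directly. This is a small sketch using the r prefix that current Dart uses for raw literals; the constant names plain and raw are chosen only for this example.

```dart
// Plain literal: \n is an escape sequence for a single newline character.
const plain = 'a\nb';

// Raw literal: the backslash and the letter n are kept as two characters.
const raw = r'a\nb';

void main() {
  print(plain.length); // 3
  print(raw.length); // 4
  // Code units of the raw string: 'a', backslash, 'n', 'b'.
  print(raw.codeUnits); // [97, 92, 110, 98]
}
```

Both values are ordinary String objects at run time; "raw" describes only how the literal is written in source code, which is why there is no conversion from one to the other.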
