Splitting a string in ActionScript - actionscript

In ActionScript, you can pass in an empty delimiter for the split method and it will break the string up into an array, like so:
var myString:* = "Test";
var myArray:* = myString.split("");
// myArray contains "T", "e", "s", "t"
What I'm wondering is whether the resulting array holds Unicode characters (8 bytes) or ASCII characters (4 bytes)?

(Note: your posted code is fine as-is, since AS3 follows the same ECMAScript conventions as JS, but...) Here is another way to look at your code, the explicitly typed AS3 way:
var Variable-Name : Type = Value;
var myString:String = "Test";
var myArray:Array = myString.split("");
// myArray contains "T", "e", "s", "t"
Arrays in AS3 are just that, a grouped collection of items! They can be Strings, Numbers or Bytes or whatever. A single Array object can have mixed item types at different indices.
You asked about bytes though, so I hope you understand that the Array you get from "splitting" a String is not the same thing as a ByteArray. It's a "generic" array, so to speak.
In my experience the default for Flash Strings is Unicode (UTF-8). So technically you have 4 bytes, until you decide to actually create a real ByteArray; how you choose to write the string as bytes determines whether you end up with 4 or 8 in the end...
var myString:String = "Test";
var myBytes:ByteArray = new ByteArray();
myBytes.writeMultiByte( myString, "ASCII" ); //length = 4 bytes
myBytes.writeMultiByte( myString, "Unicode" ); //length = 8 bytes
PS: Flash considers both UTF-8 (4 bytes here) and UTF-16 (8 bytes here) to be Unicode. Passing "Unicode" makes it default to UTF-16, but you can pass "UTF-8" instead to get a 4-byte result.
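For a quick cross-check of those sizes (shown in Swift rather than AS3, purely to illustrate the byte counts, not as part of the Flash API):
let myString = "Test"
print(myString.utf8.count)       // 4 bytes when encoded as UTF-8
print(myString.utf16.count * 2)  // 8 bytes when encoded as UTF-16 (4 code units x 2 bytes)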

Related

Convert string to base64 byte array in swift and java give different value

In the Android case everything works perfectly. I want to implement the same feature in iOS too, but I am getting different values. Please check the description and screenshots below.
In Java/Android Case:
I tried to convert the base64 string to a byte array in Java like this:
byte[] data1 = Base64.decode(balance, Base64.DEFAULT);
Output: (screenshot in the original post; the bytes are printed as signed values)
In Swift3/iOS Case:
I tried to convert the base64 string to a byte array in Swift like this:
let data:Data = Data(base64Encoded: balance, options: NSData.Base64DecodingOptions(rawValue: 0))!
let data1 = [UInt8](data) // the decoded bytes as an array of unsigned 8-bit integers
Output: (screenshot in the original post; the same bytes are printed as unsigned values)
Finally solved:
This is due to signed versus unsigned integers: UInt8 covers 0 to 255 while Int8 covers -128 to 127. We need to convert the UInt8 array to an Int8 array, and then the two outputs match.
let intArray = data1.map { Int8(bitPattern: $0) }
In no case should you try to compare data on two systems the way you just did. That goes for all types, but especially for raw data.
Raw data are NOT presentable without additional context, which means any system that does present them may choose how to present them (raw data may represent some text in UTF-8 or ASCII, maybe a JPEG or PNG image, raw RGB pixel data, an audio sample, or whatever). In your case one system shows them as a list of signed 8-bit integers while the other uses 8-bit unsigned integers for the same thing. Another system might, for instance, show you a hex string, which would look completely different.
As @Larme already mentioned, these are the same values; it is safe to assume that one system uses signed and the other unsigned representation. So to convert from signed (Android) to unsigned (iOS) you convert negative values as unsigned = 256 + signed, so for instance -55 => 256 + (-55) = 201.
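As a quick sketch of that mapping (assuming the values really are plain 8-bit integers, which is what both platforms are displaying):
// Signed -> unsigned: add 256 to negative values, or just reinterpret the bit pattern.
let signedBytes: [Int8] = [-55, 12, -1]
let unsignedBytes = signedBytes.map { UInt8(bitPattern: $0) }          // [201, 12, 255]
let viaFormula = signedBytes.map { $0 < 0 ? Int($0) + 256 : Int($0) }  // [201, 12, 255]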
If you really need to compare data in your case it is the best to save them into some file as raw data. Then transfer that file to another system and compare native raw data to those in file to check there is really a difference.
EDIT (from comment):
Printing raw data as a string is a problem, but there are a few ways. The issue is that many byte values are not printable as characters: some are whitespace or reserved control codes, and above all the value 0 usually marks the end of a string, yet it can appear in the middle of your byte sequence.
So you already have two ways of printing byte by byte: showing the corresponding Int8 or UInt8 values. As described in the comments, converting directly to a string may not work as easily as:
let string = String(data: data, encoding: .utf8) // returns nil if the bytes are not valid UTF-8
One way of converting data to string may be to convert each byte into a corresponding character. Check this code:
let characterSequence = data.map { UnicodeScalar($0) } // Create an array of characters from bytes
let stringArray = characterSequence.map { String($0) } // Create an array of strings from array of characters
let myString = stringArray.reduce("", { $0 + $1 }) // Convert an array of strings to a single string
let myString2 = data.reduce("", { $0 + String(UnicodeScalar($1)) }) // Same thing in a single line
Then to test it I used:
let data = Data(bytes: Array(0...255)) // data with byte values 0, 1, 2, ... 255
let myString2 = data.reduce("", { $0 + String(UnicodeScalar($1)) })
print(myString2)
The printing result is:
!"#$%&'()*+,-./0123456789:;<=>?#ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþ
Then another popular way is using a hex string. It can be displayed as:
let hexString = data.reduce("", { $0 + String(format: "%02hhx",$1) })
print(hexString)
And with the same data as before the result is:
000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a5b5c5d5e5f606162636465666768696a6b6c6d6e6f707172737475767778797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9fa0a1a2a3a4a5a6a7a8a9aaabacadaeafb0b1b2b3b4b5b6b7b8b9babbbcbdbebfc0c1c2c3c4c5c6c7c8c9cacbcccdcecfd0d1d2d3d4d5d6d7d8d9dadbdcdddedfe0e1e2e3e4e5e6e7e8e9eaebecedeeeff0f1f2f3f4f5f6f7f8f9fafbfcfdfeff
I hope this is enough, but in general you can do pretty much anything with an array of bytes to visualize it. For instance, you could create an image by treating the bytes as 8-bit-per-component RGB, if that makes sense for the data. It might sound silly, but if you are looking for patterns it can be quite a clever approach.

What's the best way to transform an Array of type Character to a String in Swift?

This question is specifically about converting an Array of type Character to a String. Converting an Array of Strings or numbers to a string is not the topic of discussion here.
In the following 2 lines, I would expect myStringFromArray to be set to "C,a,t,!,🐱"
var myChars: [Character] = ["C", "a", "t", "!", "🐱"]
let myStringFromArray = myChars.joinWithSeparator(",");
However, I can't execute that code because the compiler complains about an "ambiguous reference to member joinWithSeparator".
So, two questions:
1) Apple says,
"Every instance of Swift’s Character type represents a single extended
grapheme cluster. An extended grapheme cluster is a sequence of one or
more Unicode scalars that (when combined) produce a single
human-readable character."
Which to me sounds at least homogeneous enough to think it would be reasonable to implement the joinWithSeparator method to support the Character type. So, does anyone have a good answer as to why they don't do that???
2) What's the best way to transform an Array of type Character to a String in Swift?
Note: if you don't want a separator between the characters, the solution would be:
let myStringFromArray = String(myChars)
and that would give you "Cat!🐱"
Which to me sounds at least homogeneous enough to think it would be reasonable to implement the joinWithSeparator method to support the Character type. So, does anyone have a good answer as to why they don't do that???
This may be an oversight in the design. This error occurs because there are two possible candidates for joinWithSeparator(_:). I suspect this ambiguity exists because of the way Swift can implicitly interpret double quotes as either String or Character. In this context, it's ambiguous which one to choose.
The first candidate is joinWithSeparator(_: String) -> String. It does what you're looking for.
If the separator is treated as a String, this candidate is picked, and the result would be: "C,a,t,!,🐱"
The second is joinWithSeparator<Separator : SequenceType where Separator.Generator.Element == Generator.Element.Generator.Element>(_: Separator) -> JoinSequence<Self>. It's called on a Sequence of Sequences, and given a Sequence as a separator. The method signature is a bit of a mouthful, so let's break it down. The argument to this function is of Separator type. This Separator is constrained to be a SequenceType where the elements of the separator sequence (Separator.Generator.Element) must have the same type as the elements of this sequence of sequences (Generator.Element.Generator.Element).
The point of that complex constraint is to ensure that the Sequence remains homogeneous. You can't join sequences of Int with sequences of Double, for example.
If the separator is treated as a Character, this candidate is picked, and the result would be: ["C", ",", "a", ",", "t", ",", "!", ",", "🐱"]
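To see that second candidate in action without any ambiguity, you can call it on an explicit sequence of sequences (a hypothetical example in Swift 2.2 syntax, to match the rest of this answer):
let pieces: [[Character]] = [["C", "a"], ["t", "!"], ["🐱"]]
let separator: [Character] = [","]
let joined = Array(pieces.joinWithSeparator(separator))
// joined == ["C", "a", ",", "t", "!", ",", "🐱"]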
The compiler throws an error to ensure you're aware that there's an ambiguity. Otherwise, the program might behave differently than you'd expect.
You can disambiguate this situation by explicitly making each Character into a String. Because String is NOT a SequenceType, the #2 candidate is no longer possible.
var myChars: [Character] = ["C", "a", "t", "!", "🐱"]
var anotherVar = myChars.map(String.init).joinWithSeparator(",")
print(anotherVar) //C,a,t,!,🐱
This answer assumes Swift 2.2.
var myChars: [Character] = ["C", "a", "t", "!", "🐱"]
var myStrings = myChars.map({String($0)})
var result = myStrings.joinWithSeparator(",")
joinWithSeparator returning a String is only available on sequences whose elements are String:
extension SequenceType where Generator.Element == String {
    /// Interpose the `separator` between elements of `self`, then concatenate
    /// the result. For example:
    ///
    ///     ["foo", "bar", "baz"].joinWithSeparator("-|-") // "foo-|-bar-|-baz"
    @warn_unused_result
    public func joinWithSeparator(separator: String) -> String
}
You could create a new extension to support Characters:
extension SequenceType where Generator.Element == Character {
    @warn_unused_result
    public func joinWithSeparator(separator: String) -> String {
        var str = ""
        self.enumerate().forEach({
            str.append($1)
            // append the separator after every character except the last one
            if let arr = self as? [Character] where $0 < arr.endIndex - 1 {
                str.append(Character(separator))
            }
        })
        return str
    }
}
var myChars: [Character] = ["C", "a", "t", "!", "🐱"]
let charStr = myChars.joinWithSeparator(",") // "C,a,t,!,🐱"
Related discussion on Code Review.SE.
Context: Swift 3 (beta)
TL;DR Goofy Solution
var myChars:[Character] = ["C", "a", "t", "!", "🐱"]
let separators = repeatElement(Character("-"), count: myChars.count)
let zipped = zip(myChars, separators).lazy.flatMap { [$0, $1] }
let joined = String(zipped.dropLast())
Exposition
OK. This drove me nuts, in part because I got caught up in the join semantics. A join method is very useful, but when you back away from its very specific (yet common) case of string concatenation, it's doing two things at once: it splices other elements in with the original sequence, and then it flattens the two-deep array of characters (array of strings) into one single array (string).
The OP's use of single characters in an Array sent my brain elsewhere. The answers given above are the simplest way to get what was desired: convert the single characters to single-character strings and then use the join method.
If you want to consider the two pieces separately though... We start with the original input:
var myChars:[Character] = ["C", "a", "t", "!", "🐱"]
Before we can splice our characters with separators, we need a collection of separators. In this case, we want a pseudo collection that is the same thing repeated again and again, without having to actually make any array with that many elements:
let separators = repeatElement(Character(","), count: myChars.count)
This returns a Repeated object (which oddly enough you cannot instantiate with a regular init method).
Now we want to splice/weave the original input with the separators:
let zipped = zip(myChars, separators).lazy.flatMap { [$0, $1] }
The zip function returns a Zip2Sequence (which, curiously, must also be created via the free function rather than a direct initializer). By itself, when enumerated, the Zip2Sequence just produces paired tuples of (eachSequence1, eachSequence2). The flatMap expression turns that into a single series of alternating elements from the two sequences.
For large inputs, this would create a largish intermediate sequence, just to be thrown away soon after. So we insert the lazy accessor, which lets the transform be computed only on demand as we access elements from it (think iterator).
Finally, we know we can make a String from just about any sort of Character sequence. So we just pass this directly to the String creation. We add a dropLast() to avoid the last comma being added.
let joined = String(zipped.dropLast())
The valuable thing about decomposing it this way (it's definitely more lines of code, so there had better be a redeeming value) is that we gain insight into a number of tools we could use to solve problems similar, but not identical, to join. For example, say we want to keep the trailing comma? joined isn't the answer (see the one-liner below). Suppose we want a non-constant separator? Just rework the 2nd line. Etc...
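As a tiny follow-up to that point, keeping the trailing separator is just a matter of not dropping the last element (same variables and Swift 3 beta APIs as in the exposition above):
let joinedWithTrailingComma = String(zipped) // "C,a,t,!,🐱,"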

get length of string in UTF8

How can I get the length (not number of bytes) of a string in its UTF-8 encoded form (PHP's mb_strlen(.., 'UTF-8') equivalent)?
I tried string.characters.count but it does not return the correct length for certain characters like an emoji.
Example:
let s = "✌🏿️"
print(s.characters.count) // prints 2, but should print 3.
You can access the UTF-8 encoding of a string with the .utf8 property. Use count on that to get the number of UTF-8 code units in the string:
let string = "\u{1f603}" // One of the smiley face emojis...
print(string.utf8.count) // prints "4"
Based on your edited question, what you are probably looking for is the number of UnicodeScalars used to encode the string. You access that with the unicodeScalars property:
let s = "✌🏿️"
print(s.unicodeScalars.count) // prints 3
The reason everyone is confused is that your original question asks for the length of the string in its UTF-8 encoded form, while the answer you actually wanted has nothing to do with the UTF-8 encoding at all.
I think you are confused about the difference between Unicode "extended grapheme clusters", Unicode code points, and the various encodings (like UTF-8) that can be used to encode a Unicode code point.
A Character in Swift represents what Unicode calls an "extended grapheme cluster". That is to say, it is a single visual character, even if it is made up of multiple Unicode code points.
A Unicode code point is a single linguistic symbol that is given a numeric value (at most 21 bits are needed, though Swift stores it in a 32-bit type). Two or more Unicode code points can combine to create a single Character. In Swift, the Unicode code point is represented by the UnicodeScalar type.
When it comes time to store a string, or send it over the internet, or otherwise turn it into data that is represented by bytes, you have to decide how to encode it. There are all kinds of encodings, the most common is probably UTF-8, which encodes the string as a series of UInt8 values.
That's just a brief snippet of the difference between the three concepts. It is actually a really interesting subject and if you Google some of those terms, you will find a lot more good information.
let str = "ačŘ"
print("str has \(str.characters.count) characters") // 3
print("and \(str.utf8.count) bytes as encoded in UTF-8") // 5
update (based on your notes)
let s = "✌🏿️"
let arr:[UInt8] = [226, 156, 140, 240, 159, 143, 191, 239, 184, 143]
var arrCchar = arr.map { (uint8) -> Int8 in
    Int8(bitPattern: uint8)
}
arrCchar += [0] // to be null terminated
let str = String.fromCString(&arrCchar)
print(str) // Optional("✌🏿️")
s == str // TRUE !!!!
by characters
s.characters.forEach { (c) -> () in
    let str = String(c)
    print(str.utf8.map{$0}, "which represents character: ", c)
    str.unicodeScalars.forEach({ (u) -> () in
        print("composed from unicode scalar(s): ", u.debugDescription)
    })
}
/*
[226, 156, 140] which represents character: ✌
composed from unicode scalar(s): "\u{270C}"
[240, 159, 143, 191, 239, 184, 143] which represents character: 🏿️
composed from unicode scalar(s): "\u{0001F3FF}"
composed from unicode scalar(s): "\u{FE0F}"
*/
Every character in Unicode can be represented by one or more unicode scalars. A unicode scalar is a unique 21-bit number (and name) for a character or modifier, such as U+0061 for LATIN SMALL LETTER A ("a"), or U+1F425 for FRONT-FACING BABY CHICK ("🐥").
When a Unicode string is written to a text file or some other storage, these unicode scalars are encoded in one of several Unicode-defined formats. Each format encodes the string in small chunks known as code units. These include the UTF-8 format (which encodes a string as 8-bit code units) and the UTF-16 format (which encodes a string as 16-bit code units).
// Quoted from Apple's "The Swift Programming Language" guide.
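Putting that terminology together for the string from the question, each view of the same String gives a different count (a small sketch using the Swift 2.2-era API to match the code above; note that grapheme clustering can vary between Swift/Unicode versions):
let s = "✌🏿️"                  // U+270C, U+1F3FF, U+FE0F
print(s.characters.count)     // 2  extended grapheme clusters (Characters)
print(s.unicodeScalars.count) // 3  Unicode scalars (code points)
print(s.utf16.count)          // 4  UTF-16 code units (U+1F3FF needs a surrogate pair)
print(s.utf8.count)           // 10 UTF-8 code units, i.e. 10 bytes (3 + 4 + 3)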

How can I convert a string to a char array in ActionScript 3?

How do you convert a string into a char array in ActionScript 3.0?
I tried the below code but I get an error:
var temp:ByteArray = new ByteArray();
temp = input.toCharArray();
From the error, I understand that the toCharArray() function cannot be applied to a string (i.e in my case - input). Please help me out. I am a beginner.
I am not sure if this helps your purpose but you can use String#split():
If you use an empty string ("") as a delimiter, each character in the string is placed as an element in the array.
var array:Array = "split".split("");
Now you can get individual elements by index:
array[0] == 's'; array[1] == 'p'; ...
Depending on what you need to do with it, the individual characters can also be accessed with string.charAt(index), without splitting them into an array.

How does ActionScript's writeInt work with Integers?

I wanted to know how does writeInt treat a 32 bit unsigned or a signed integer passed to it?
It is easy to understand how it works with a hexadecimal number. Util.Print will print the corresponding ASCII characters.
0x41424344 will be broken down into 4 1 byte characters, A, B, C and D.
It seems like its different when an integer is passed to writeInt.
for instance,
var test: ByteArray = new ByteArray();
test.writeInt(0x41424344); // prints ABCD
test.writeInt(2590463591); // prints gVg
test.writeInt(1119885898); // prints BÀJ
I am unclear how the Util.Print function treats the integers written into the ByteArray by writeInt.
The characters "gVg" do not correspond to the integer 2590463591.
According to the definition of writeInt here:
http://livedocs.adobe.com/livecycle/es/sdkHelp/common/langref/flash/utils/ByteArray.html#writeInt%28%29
It states that it works with a 32 Bit Signed Integer.
If someone can elaborate over how it translates the integers to characters, it would be helpful.
EDIT: And how does it handle negative integers?
For instance,
test.writeInt(-11338743); // prints ÿRü
So,
-11338743 = 0xFF52FC09
is that correct?
Thanks.
If you interpret the encoded bytes as ASCII (extended ASCII/Latin-1 for values above 127), you get:
dec hex ascii
1094861636 = 0x41424344 = ABCD
2590463591 = 0x9A675667 = gVg
1119885898 = 0x42C01A4A = BÀJ
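The negative case from your edit works the same way, via two's complement: -11338743 is stored as 0xFF52FC09, exactly as you guessed. As a sketch of the byte math (written in Swift purely to illustrate the arithmetic; the helper names are made up, and writeInt itself is AS3, writing big-endian by default):
import Foundation // for String(format:)

// Big-endian byte breakdown of a signed 32-bit integer, reinterpreting the
// sign bit the way two's-complement storage does.
func bigEndianBytes(of value: Int32) -> [UInt8] {
    let u = UInt32(bitPattern: value)
    return [UInt8(truncatingIfNeeded: u >> 24),
            UInt8(truncatingIfNeeded: u >> 16),
            UInt8(truncatingIfNeeded: u >> 8),
            UInt8(truncatingIfNeeded: u)]
}

func hex(_ bytes: [UInt8]) -> String {
    return bytes.map { String(format: "%02X", $0) }.joined()
}

print(hex(bigEndianBytes(of: 0x41424344))) // "41424344" -> "ABCD" as ASCII
print(hex(bigEndianBytes(of: -11338743)))  // "FF52FC09" -> "ÿRü" plus a control byte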
Also, note that signed int vs. unsigned int use different functions:
var test:ByteArray = new ByteArray();
test.writeInt(0x41424344);
test.writeUnsignedInt(0x41424344);
