Parse String representation of byte array in Java ME - BlackBerry

I am sending a byte array over a REST service. It is being received as a String. Here is an extract of it, with its start and end brackets.
[0,0,0,0,32,122,26,65,0,0,0,0,96,123,26,65,0,0,0,0,192,123,20,65,0,0,0,0,0,125,20,65,71,73,70,56,57,97,244,1,244,1,247,0,0,51,85,51,51,85,102,51,85,153,51,85,204,51,85,255,51,128,0,51,128,51,51,128,102,51,128,153,51,128,204,51,128,255,51,170,0,51,170,51,51,170,102,51,170,153,51,170,204,51,170,255,51,213,0,51,213,51,51,213,102,51,213,153,51,213,204,51,213,255,51,255,0,51,255,51,51,255,102,51,255,153,51,255,204,51]
Now, before anyone suggests sending it as a Base64-encoded String: that would require BlackBerry to actually have a working Base64 decoder. But alas, it fails for files over 64k and I've tried all sorts.
Anyway, this is what I've tried:
str = str.replace('[', ' ');
str = str.replace(']', ' ');
String[] tokens = split(str, ",");
byte[] decoded = new byte[tokens.length];
for (int i = 0; i < tokens.length; i++)
{
    decoded[i] = (byte) Integer.parseInt(tokens[i]);
}
But it fails. (split is like the Java implementation found here.)
Logically it should work, but it doesn't. This is for Java ME / BlackBerry. No Java SE answers please (unless they work on Java ME).

Two problems: one minor and one that is a pain.
Minor: whitespace (as mentioned by Nikita).
Major: casting to bytes. Since Java only has signed bytes, values of 128 and higher become negative numbers when cast from int to byte.
str = str.replace('[', ' ');
str = str.replace(']', ' ');
String[] tokens = split(str, ","); // on Java SE this would be: String[] tokens = str.split(",");
byte[] decoded = new byte[tokens.length];
for (int i = 0; i < tokens.length; i++) {
    decoded[i] = (byte) (Integer.parseInt(tokens[i].trim()) & 0xFF);
}
// Note: the enhanced for loop is not available on Java ME (CLDC), so use an indexed loop.
for (int i = 0; i < decoded.length; i++) {
    int tmp = decoded[i] & 0xFF;
    System.out.print("byte:" + tmp);
}
(BTW: implementing a Base64 encoder/decoder isn't especially hard, though it might be overkill for your project.)
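To back that up, here is a minimal sketch of a Base64 decoder that uses only core Java ME classes. The class and method names are mine (not from any BlackBerry API), it only skips whitespace and padding, and it has not been tested on a device, so treat it as a starting point rather than a drop-in solution:
// Minimal Base64 decoder sketch - illustration only, not production code.
public final class SimpleBase64 {
    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    public static byte[] decode(String s) {
        // First pass: collect the 6-bit values, skipping whitespace and '=' padding.
        int[] vals = new int[s.length()];
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '=' || c == '\r' || c == '\n' || c == '\t' || c == ' ') {
                continue; // padding and whitespace carry no data
            }
            int v = ALPHABET.indexOf(c);
            if (v < 0) {
                throw new IllegalArgumentException("Bad Base64 char: " + c);
            }
            vals[n++] = v;
        }
        // Every 4 significant chars encode 3 bytes.
        byte[] out = new byte[(n * 3) / 4];
        int o = 0;
        int buffer = 0, bits = 0;
        for (int i = 0; i < n; i++) {
            buffer = (buffer << 6) | vals[i];
            bits += 6;
            if (bits >= 8) {
                bits -= 8;
                out[o++] = (byte) ((buffer >> bits) & 0xFF);
            }
        }
        return out;
    }
}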

Remove the brackets instead of replacing them with spaces. Note that the char-based replace(char, char) cannot take an empty character literal, so strip them off with substring instead:
str = str.substring(str.indexOf('[') + 1, str.lastIndexOf(']'));
With your current code you end up with the following tokens:
[" 0", "0", "0", ..., "204", "51 "]
The first element, " 0", cannot be parsed to an Integer.
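A quick standalone check (mine, not from the answer) of that failure mode:
// Integer.parseInt rejects surrounding whitespace.
System.out.println(Integer.parseInt("0"));   // prints 0
System.out.println(Integer.parseInt(" 0"));  // throws NumberFormatException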

I recommend using a Base64-encoded string to send the byte array.
There's a post with a link to a Base64 library for J2ME.
That way you can convert the byte array to a string and later convert that string back to a byte array.
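As a rough sketch of the round trip (MyBase64 and readImageBytes() are hypothetical placeholders for illustration, not the API of the library linked above):
byte[] payload = readImageBytes();       // hypothetical: the raw bytes to send
String wire = MyBase64.encode(payload);  // sender side: byte[] -> String
byte[] restored = MyBase64.decode(wire); // device side: String -> byte[]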

Related

Objective-C how to convert a keystroke to ASCII character code?

I need to find a way to convert an arbitrary character typed by a user into an ASCII representation to be sent to a network service. My current approach is to create a lookup dictionary and send the corresponding code. After creating this dictionary, I see that it is hard to maintain and determine if it is complete:
__asciiKeycodes[@"F1"] = @(112);
__asciiKeycodes[@"F2"] = @(113);
__asciiKeycodes[@"F3"] = @(114);
//...
__asciiKeycodes[@"a"] = @(97);
__asciiKeycodes[@"b"] = @(98);
__asciiKeycodes[@"c"] = @(99);
Is there a better way to get ASCII character code from an arbitrary key typed by a user (using standard 104 keyboard)?
Objective-C has the base C primitive data types, so there is a little trick you can do: assign the keystroke to a char, then cast it to an int. The default conversion in C from a char to an int gives that char's ASCII value. Here's a quick example.
char character = 'a';
NSLog(@"a = %d", (int)character);
Console output: a = 97
To go the other way around, cast an int to a char:
int asciiValue = 97;
NSLog(@"97 = %c", (char)asciiValue);
Console output: 97 = a
Alternatively, you can do a direct conversion within initialization of your int or char and store it in a variable.
char asciiToCharOf97 = (char)97; //Stores 'a' in asciiToCharOf97
int charToAsciiOfA = (int)'a'; //Stores 97 in charToAsciiOfA
This seems to work for most keyboard keys; I'm not sure about function keys and the Return key.
NSString *input = @"abcdefghijklkmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890!@#$%^&*()_+[]\\{}|;':\"\\,./<>?~ ";
for (int i = 0; i < input.length; i++)
{
    NSLog(@"Found (at %i): %i", i, [input characterAtIndex:i]);
}
Use a stringWithFormat: call and pass the int values.

C++ - Removing invalid characters when a user paste in a grid

Here's my situation. I need to filter out invalid characters that a user may paste from Word or Excel documents.
Here is what I'm doing.
First, I'm trying to convert any Unicode characters to ASCII:
extern "C" COMMON_STRING_FUNCTIONS long ConvertUnicodeToAscii(wchar_t * pwcUnicodeString, char* &pszAsciiString)
{
int nBufLen = WideCharToMultiByte(CP_ACP, 0, pwcUnicodeString, -1, NULL, 0, NULL, NULL)+1;
pszAsciiString = new char[nBufLen];
WideCharToMultiByte(CP_ACP, 0, pwcUnicodeString, -1, pszAsciiString, nBufLen, NULL, NULL);
return nBufLen;
}
Next, I'm filtering out any character that does not have a value between 31 and 127:
String __fastcall TMainForm::filterInput(String l_sConversion)
{
    // Used to store every character that was stripped out.
    String filterChars = "";
    // Not used. We never received the whitelist.
    String l_SWhiteList = "";
    // Our String without the invalid characters.
    AnsiString l_stempString;
    // Convert the string into an array of chars.
    wchar_t* outputChars = l_sConversion.w_str();
    char * pszOutputString = NULL;
    // Convert any Unicode characters to ASCII.
    ConvertUnicodeToAscii(outputChars, pszOutputString);
    l_stempString = (AnsiString)pszOutputString;
    // We're going backwards since we are removing characters, which changes the length and position.
    for (int i = l_stempString.Length(); i > 0; i--)
    {
        char l_sCurrentChar = l_stempString[i];
        // If we don't have a valid character, filter it out of the string.
        if (((unsigned int)l_sCurrentChar < 31) || ((unsigned int)l_sCurrentChar > 127))
        {
            String l_sSecondHalf = "";
            String l_sFirstHalf = "";
            l_sSecondHalf = l_stempString.SubString(i + 1, l_stempString.Length() - i);
            l_sFirstHalf = l_stempString.SubString(0, i - 1);
            l_stempString = l_sFirstHalf + l_sSecondHalf;
            filterChars += "\'" + ((String)(unsigned int)(l_sCurrentChar)) + "\' ";
        }
    }
    if (filterChars.Length() > 0)
    {
        LogInformation(__LINE__, __FUNC__, Utilities::LOG_CATEGORY_GENERAL, "The Following ASCII Values were filtered from the string: " + filterChars);
    }
    // Delete the char* to avoid memory leaks.
    delete [] pszOutputString;
    return l_stempString;
}
Now this seems to work, except when you try to copy and paste bullets from a Word document.
o Bullet1:
 subbullet1.
You will get something like this
oBullet1?subbullet1.
My filter function is called on an onchange event.
The bullets are replaced with the value o and a question mark.
What am I doing wrong, and is there a better way to do this?
I'm using C++Builder XE5, so please no Visual C++ solutions.
When you perform the conversion to ASCII (which is not actually converting to ASCII, by the way), Unicode characters that are not supported by the target codepage are lost - either dropped, replaced with ?, or replaced with a close approximation - so they are not available to your scanning loop. You should not do the conversion at all; scan the source Unicode data as-is instead.
Try something more like this:
#include <System.Character.hpp>

String __fastcall TMainForm::filterInput(String l_sConversion)
{
    // Used to store every character sequence that was stripped out.
    String filterChars;
    // Not used. We never received the whitelist.
    String l_SWhiteList;
    // Our String without the invalid sequences.
    String l_stempString;
    int numChars;
    for (int i = 1; i <= l_sConversion.Length(); i += numChars)
    {
        UCS4Char ch = TCharacter::ConvertToUtf32(l_sConversion, i, numChars);
        String seq = l_sConversion.SubString(i, numChars);
        // If we don't have a valid codepoint, filter it out of the string.
        if ((ch <= 31) || (ch >= 127))
            filterChars += (_D("\'") + seq + _D("\' "));
        else
            l_stempString += seq;
    }
    if (!filterChars.IsEmpty())
    {
        LogInformation(__LINE__, __FUNC__, Utilities::LOG_CATEGORY_GENERAL, _D("The Following Values were filtered from the string: ") + filterChars);
    }
    return l_stempString;
}

CP037 encoding in blackberry

As the CP037 encoding is not supported by BlackBerry by default, does anyone know if there is a ready-made library that I would be able to use? I've had a look online and I can't seem to find anything. Is the only option to write one myself? Does anyone have any tips on how to do such a thing?
Writing your own bytes -> String decoder seems pretty straightforward, as the encoding has no more than 256 characters. Just turn the table from Wikipedia into a switch statement, and accumulate the resulting characters into a String.
byte[] rawCP037data = getEbcdicDatabytes();
StringBuffer buf = new StringBuffer();
for (int i = 0; i < rawCP037data.length; i++) {
    // Mask to 0..255 because Java bytes are signed and the switch below uses unsigned values.
    buf.append(convertCP037toChar(rawCP037data[i] & 0xFF));
}
String decodedString = buf.toString();

char convertCP037toChar(int b) {
    switch (b) {
        case 0x99:
            return 'r';
        case 0xAB: // upside down question mark
            return '\u00BF';
        // TODO! fill out the rest of the table here
        default:
            return '?'; // placeholder until the full table is filled in
    }
}
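If a large switch feels unwieldy, a lookup table works just as well. Below is a rough sketch of that approach; the space/letter/digit entries shown are the standard EBCDIC positions, but the table is deliberately incomplete and you would still need to fill in the remaining entries from the Wikipedia chart (the '?' placeholder marks anything unmapped):
// Sketch of a table-driven CP037 -> Unicode decoder (incomplete; illustration only).
public final class Cp037Decoder {
    private static final char[] TABLE = new char[256];

    static {
        // Default everything to '?' until a real mapping is filled in.
        for (int i = 0; i < 256; i++) {
            TABLE[i] = '?';
        }
        TABLE[0x40] = ' ';               // EBCDIC space
        fill(0x81, "abcdefghi");         // 0x81-0x89
        fill(0x91, "jklmnopqr");         // 0x91-0x99
        fill(0xA2, "stuvwxyz");          // 0xA2-0xA9
        fill(0xC1, "ABCDEFGHI");         // 0xC1-0xC9
        fill(0xD1, "JKLMNOPQR");         // 0xD1-0xD9
        fill(0xE2, "STUVWXYZ");          // 0xE2-0xE9
        fill(0xF0, "0123456789");        // 0xF0-0xF9
        // TODO: punctuation, control characters and accented letters from the chart.
    }

    private static void fill(int start, String chars) {
        for (int i = 0; i < chars.length(); i++) {
            TABLE[start + i] = chars.charAt(i);
        }
    }

    public static String decode(byte[] data) {
        StringBuffer buf = new StringBuffer(data.length);
        for (int i = 0; i < data.length; i++) {
            buf.append(TABLE[data[i] & 0xFF]); // mask: Java bytes are signed
        }
        return buf.toString();
    }
}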

iOS - XML to NSString conversion

I'm using NSXMLParser to parse XML in my app and I'm having a problem with the encoding type. For example, here is one of the feeds coming in. It looks similar to this:
\U2026Some random text from the xml feed\U2026
I am currently using the encoding type:
NSData *data = [string dataUsingEncoding:NSUTF8StringEncoding];
Which encoding type am I supposed to use for converting \U2026 into an ellipsis (...)?
The answer here is you're screwed. They are using a non-standard encoding for XML, but what if they really want the literal \U2026? Let's say you add a decoder to handle all \UXXXX and \uXXXX encodings. What happens when another feed wants the data to be the literal \U2026?
Your first choice and best bet is to get this feed fixed. If they need to encode data, they need to use proper HTML entities or numeric references.
As a fallback, I would isolate the decoder away from the XML parser. Don't create a non-conforming XML parser just because you're getting non-conforming data. Have a post-processor that is only run on the offending feed.
If you must have a decoder, then there is more bad news: there is no built-in decoder, so you will need to find a category online or write one yourself.
After some poking around, I think Using Objective C/Cocoa to unescape unicode characters, ie \u1234 may work for you.
Alright, here's a snippet of code that should work for any Unicode code point:
NSString *stringByUnescapingUnicodeSymbols(NSString *input)
{
    NSMutableString *output = [NSMutableString stringWithCapacity:[input length]];
    // get the UTF8 string for this string...
    const char *UTF8Str = [input UTF8String];
    while (*UTF8Str) {
        if (*UTF8Str == '\\' && tolower(*(UTF8Str + 1)) == 'u')
        {
            // skip the next 2 chars '\' and 'u'
            UTF8Str += 2;
            // make sure we only read 4 chars
            char tmp[5] = { UTF8Str[0], UTF8Str[1], UTF8Str[2], UTF8Str[3], 0 };
            long unicode = strtol(tmp, NULL, 16); // remember that Unicode escapes are base 16
            [output appendFormat:@"%C", (unichar)unicode];
            // move past the first 3 hex digits (the trailing UTF8Str++ below consumes
            // the 4th), making sure we don't run off the end of the string
            for (int i = 0; i < 3; i++) {
                if (*UTF8Str == 0)
                    break;
                UTF8Str++;
            }
        }
        else
        {
            if (*UTF8Str == 0)
                break;
            [output appendFormat:@"%c", *UTF8Str];
        }
        UTF8Str++;
    }
    return output;
}
You could simply replace the literal '\U2026' in the string with the character it represents, then encode it to NSData using NSUTF8StringEncoding.

MD5 with ASCII Char

I have a string
wDevCopyright = [NSString stringWithFormat:@"Copyright: %c 1995 by WIRELESS.dev, Corp Communications Inc., All rights reserved.", 0xa9];
and to munge it I call
- (NSString *)getMD5:(NSString *)source
{
    const char *src = [source UTF8String];
    unsigned char result[CC_MD5_DIGEST_LENGTH];
    CC_MD5(src, strlen(src), result);
    return [NSString stringWithFormat:
            @"%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x",
            result[0], result[1], result[2], result[3],
            result[4], result[5], result[6], result[7],
            result[8], result[9], result[10], result[11],
            result[12], result[13], result[14], result[15]
            ]; //ret;
}
Because of the 0xa9, src = [source UTF8String] does not create a char array that represents the string, so the resulting hash is not comparable with the one produced on other platforms.
I tried to encode the char with NSASCIIStringEncoding but it broke the code.
How do I call CC_MD5 with a string that has ASCII characters and get the same hash as in Java?
Update to code request:
Java
private static char[] kTestASCII = {
169
};
System.out.println("\n\n>>>>> msg## " + (char)0xa9 + " " + (char)169 + "\n md5 " + md5(new String(kTestASCII), false) //unicode = false
Result >>>>> msg## \251 \251
md5 a252c2c85a9e7756d5ba5da9949d57ed
ObjC
char kTestASCII [] = {
169
};
NSString *testString = [NSString stringWithCString:kTestASCII encoding:NSUTF8StringEncoding];
NSLog(#">>>> objC msg## int %d char %c md5: %#", 0xa9, 169, [self getMD5:testString]);
Result >>>> objC msg## int 169 char © md5: 9b759040321a408a5c7768b4511287a6
** As stated earlier, without the 0xa9 the hashes in Java and Objective-C are the same. I am trying to get the hash for 0xa9 to be the same in Java and Objective-C.
Java MD5 code
private static char[] kTestASCII = {
169
};
md5(new String(kTestASCII), false);
/**
 * Compute the MD5 hash for the given String.
 * @param value the string to add to the digest
 * @param unicode true if the string is Unicode, false for ASCII strings
 */
public synchronized final String md5(String value, boolean unicode)
{
MD5();
MD5.update(value, unicode);
return WUtilities.toHex(MD5.finish());
}
public synchronized void update(String s, boolean unicode)
{
if (unicode)
{
char[] c = new char[s.length()];
s.getChars(0, c.length, c, 0);
update(c);
}
else
{
byte[] b = new byte[s.length()];
s.getBytes(0, b.length, b, 0);
update(b);
}
}
public synchronized void update(byte[] b)
{
update(b, 0, b.length);
}
//--------------------------------------------------------------------------------
/**
* Add a byte sub-array to the digest.
*/
public synchronized void update(byte[] b, int offset, int length)
{
for (int n = offset; n < offset + length; n++)
update(b[n]);
}
/**
* Add a byte to the digest.
*/
public synchronized void update(byte b)
{
int index = (int)((count >>> 3) & 0x03f);
count += 8;
buffer[index] = b;
if (index >= 63)
transform();
}
I believe that my issue is with using NSData withEncoding as opposed to a C char[] or the Java byte[]. So what is the best way to roll my own bytes into a byte[] in objC?
The character you are having problems with, ©, is the Unicode COPYRIGHT SIGN (U+00A9). The correct UTF-8 encoding of this character is the byte sequence 0xc2 0xa9.
You are attempting, however, to convert from the single-byte sequence 0xa9, which is not a valid UTF-8 encoding of any character. See Table 3-7 of http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf#G7404. Since this is not a valid UTF-8 byte sequence, stringWithCString converts your input to the Unicode REPLACEMENT CHARACTER (U+FFFD). When this character is then encoded back into UTF-8, it yields the byte sequence 0xef 0xbf 0xbd. The MD5 of this sequence is 9b759040321a408a5c7768b4511287a6, as reported by your Objective-C example.
Your Java example yields an MD5 of a252c2c85a9e7756d5ba5da9949d57ed, which simple experimentation shows is the MD5 of the byte sequence 0xa9, which I have already noted is not a valid UTF-8 representation of the desired character.
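As a quick way to reproduce that experiment, here is a small Java SE sketch (it uses java.security.MessageDigest, which is not part of the original code; the expected output is the a252c2c8... value quoted above):
import java.security.MessageDigest;

public class Md5OfRawByte {
    public static void main(String[] args) throws Exception {
        // The single byte 0xA9 - the low byte of U+00A9, not valid UTF-8 on its own.
        byte[] raw = { (byte) 0xA9 };
        byte[] digest = MessageDigest.getInstance("MD5").digest(raw);
        StringBuffer hex = new StringBuffer();
        for (int i = 0; i < digest.length; i++) {
            String h = Integer.toHexString(digest[i] & 0xFF);
            if (h.length() == 1) {
                hex.append('0');
            }
            hex.append(h);
        }
        System.out.println(hex.toString()); // expected: a252c2c85a9e7756d5ba5da9949d57ed
    }
}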
I think we need to see the implementation of the Java md5() method you are using. I suspect it is simply dropping the high bytes of every Unicode character to convert to a byte sequence for passing to the MessageDigest class. This does not match your Objective-C implementation where you are using a UTF-8 encoding.
Note: even if you fix your Objective-C implementation to match the encoding of your Java md5() method, your test will need some adjustment because you cannot use stringWithCString with the NSUTF8StringEncoding encoding to convert the byte sequence 0xa9 to an NSString.
UPDATE
Having now seen the Java implementation using the deprecated getBytes method, my recommendation is to change the Java implementation, if at all possible, to use a proper UTF-8 encoding.
I suspect, however, that your requirements are to match the current Java implementation, even if it is wrong. Therefore, I suggest you duplicate the bad behavior of Java's deprecated getBytes() method by using NSString getCharacters:range: to retrieve an array of unichars, then manually create an array of bytes by taking the low byte of each unichar.
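For reference, a tiny Java sketch (mine, not part of the answer) of the behavior being duplicated; the deprecated getBytes(int, int, byte[], int) keeps only the low 8 bits of each char:
String s = String.valueOf((char) 0x00A9);  // the copyright sign
byte[] b = new byte[s.length()];
s.getBytes(0, b.length, b, 0);             // deprecated: copies only the low byte of each char
System.out.println(b[0] & 0xFF);           // prints 169 (0xA9)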
stringWithCString requires a null terminated C-String. I don't think that kTestASCII[] is necessarily null terminated in your Objective-C code. Perhaps that is the cause of the difference.
Try:
char kTestASCII [] = {
169,
0
};
Thanks to GBegan's explanation, here is my solution:
NSMutableData *bytes = [NSMutableData data];
for (int i = 0; i < [s length]; i++) {
    int number = [s characterAtIndex:i];
    unsigned char lowByte = (unsigned char)number; // keep only the low 8 bits, as Java's getBytes does
    [bytes appendBytes:&lowByte length:1];
}
