Checksum calculation in Visual Basic

I want to communicate with a medical analyzer and send messages to it, but the analyzer requires a control character, or checksum, at the end of each message to validate it.
Please excuse my limited knowledge of English, but according to the manual, here is how to calculate that checksum:
The control character is the exclusion logic sum (exclusive OR), in character units, from the start of the text to the end of the text and consists of a 1 byte binary. Moreover, since this control character becomes the value of 00~7F by hexadecimal, please process not to mistake for the control code used in transmission.
So please, can you tell me how to compute this control character from that information? Because of my limited English I did not fully understand what is written.
I'm programming in Visual Basic.
Thanks

The manual isn't in very good English either. My understanding:
The message you will send to the device can be represented as a byte array. You want to XOR together every byte, which leaves you with a one-byte checksum to append to the end of the message (byte 1 XOR byte 2 XOR byte 3 XOR ...).
Everything after "Moreover" appears to say "If you are reading from the device, the final character is a checksum, not a command, so do not treat it as a command, even if it looks like one.". You are writing to the device, so you can ignore that part.
To XOR (bear in mind I don't know VB):
have a checksum variable, set to 0. Loop over each byte of the message, and XOR the checksum variable with that byte. Store the result back in the checksum.
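In VB.NET, that loop might look like this (a minimal, untested sketch; the function name and the byte-array parameter are just illustrative):

Function XorChecksum(ByVal message As Byte()) As Byte
    Dim checksum As Byte = 0          ' XOR with 0 leaves a byte unchanged
    For Each b As Byte In message
        checksum = checksum Xor b     ' fold each byte into the running XOR
    Next
    Return checksum                   ' append this one byte to the message
End Function

Note that if the message text is plain 7-bit ASCII, every byte is in the range &H00-&H7F, so the XOR result stays in &H00-&H7F as well, which may be what the manual's remark about 00~7F is getting at.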

Related

Calculation of Internet Checksum of two 16-bit streams

I want to calculate the Internet checksum of two bit streams of 16 bits each. Do I need to break these strings into segments, or can I sum them both directly?
Here are the strings:
String 1 = 1010001101011111
String 2 = 1100011010000110
Short answer
No. You don't need to split them.
Somewhat longer answer
Not sure exactly what you mean by "Internet" checksum; the name usually refers to the RFC 1071 checksum used in IP, TCP and UDP headers, which treats its input as a sequence of 16-bit words, adds them with one's-complement arithmetic (a carry out of the top bit is added back into the lowest bit), and takes the one's complement of the sum. But anyway:
A checksum algorithm accepts input of any length, so there is nothing to split; in theory, your input strings could be of any length at all. Your two strings are each already a single 16-bit word, so you simply add them directly.
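Worked out for your two words (assuming the RFC 1071 checksum is what you're after):

1010001101011111 = 0xA35F
1100011010000110 = 0xC686
0xA35F + 0xC686 = 0x169E5 (17 bits, so there is a carry out of the top bit)
0x69E5 + 1 = 0x69E6 (end-around carry: add that carry back into the low bit)
NOT 0x69E6 = 0x9619 (the one's complement of the sum is the checksum)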
You can test this with any basic online checksum generator. Such tools typically generate a whole slew of checksums using lots of different algorithms and list the algorithm names alongside the results.
If you want to do this in code, a good starting point might be to search for examples using one of them in whatever language / environment you are working in.

How do I get a number from bytes?

I am currently trying to work with Lua 5.1 bytecode. I've gotten pretty far and understand a lot. However, I am stuck on a question about instructions and numbers. I understand that the sizes of an instruction and a number are defined in the header, but I am not sure how to get the actual number from the 4 bytes (or whatever size is specified in the header).
I've looked at output from ChunkSpy and I don't really understand how it went from those bytes to the number. I'd look in the source, but I don't want to just copy it; I want to understand it. If anyone could tell me a bit about it or even point me in the right direction, I'd be very grateful.
Thank you!
From A No-Frills Introduction to Lua 5.1 VM Instructions, numbers are stored in the constants pool.
The first byte is 3=LUA_TNUMBER.
The next bytes are the number, with the length as given in the header. Interpretation is based on the length, byte order and the integral flag as given in the header.
Typically, non-integral with 8 bytes means IEEE 754 64-bit double.
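For example, assuming the default header from a little-endian x86 build (8-byte, non-integral numbers), the constant 1 appears in the pool as the nine bytes 03 00 00 00 00 00 00 F0 3F: the type tag 3 followed by the IEEE 754 double 1.0 in little-endian byte order.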
Deserializing the bytes to a double involves extracting the bits of the mantissa and exponent and combining them with arithmetic operations. Perhaps you want to take that as a challenge and start from a description of the standard: What Every Computer Scientist Should Know About Floating-Point Arithmetic, "Formats and Operations" section.
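If you'd rather see it spelled out, here is a minimal Lua 5.1 sketch (my own illustration, not ChunkSpy's code) that assumes little-endian byte order and 8-byte doubles, and ignores infinities and NaNs:

local function bytesToDouble(b)           -- b[1]..b[8], b[1] least significant
  local sign = b[8] >= 128 and -1 or 1    -- top bit of the last byte
  local exponent = (b[8] % 128) * 16 + math.floor(b[7] / 16)  -- 11 bits
  local mantissa = b[7] % 16              -- top 4 of the 52 mantissa bits
  for i = 6, 1, -1 do
    mantissa = mantissa * 256 + b[i]      -- pull in the remaining 48 bits
  end
  if exponent == 0 then                   -- zero or a subnormal number
    return sign * mantissa * 2^-1074
  end
  return sign * (1 + mantissa / 2^52) * 2^(exponent - 1023)
end

print(bytesToDouble{ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xF0, 0x3F })  --> 1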

ASCII Representation of Hexadecimal

I have a string that, after dumping each byte with string.format("%02X", char), came out as the following:
74657874000000EDD37001000300
In the end, I'd like that string to look like the following:
t e x t NUL NUL NUL í Ó p SOH NUL ETX NUL (the spaces are there just to separate the desired characters in this example).
I've tried to use \x..(hex#) and string.char(0x..(hex#)) (where (hex#) is the hex representation of my desired character), and I am still having issues getting the result I'm looking for. After reading another thread about this topic, what is the way to represent a unichar in lua, and the links provided in its answers, I still don't fully understand what my final code needs to do for this to work.
I'm looking for some help in better understanding an approach that would let me achieve the desired result shown above.
ETA:
Well I thought that I had fixed it with the following code:
function hexToAscii(input)
  local convString = ""
  for char in input:gmatch("(..)") do
    convString = convString..(string.char("0x"..char))
  end
  return convString
end
It appeared to work, but I didn't think about characters above 127. Rookie mistake. Now I'm unsure how to get the additional characters, up to 255, to display.
I did the following to check, since I couldn't truly "see" them in the file.
function asciiSub(input)
  input = input:gsub(string.char(0x00), "<NUL>") -- suggested by a coworker
  print(input)
end
I did a few gsub calls to substitute in other characters, and my file comes back with the replacement strings. But when I ran into characters from the extended ASCII range, it all fell apart.
Can anyone assist me in understanding a fix or new approach to this problem? As I've stated before, I read other topics on this and am still confused as to the best approach towards this issue.
The simple way to transform a base16-encoded string is just to
function unhex( input )
  return (input:gsub( "..", function(c)
    return string.char( tonumber( c, 16 ) )
  end))
end
This is basically what you have, just a bit cleaner. (There's no need to say "(..)", ".." is enough – if you specify no captures, you'll automatically get the whole match. And while it might work if you write string.char( "0x"..c ), it's just evil – you concatenate lots of strings and then trigger the automatic conversion to numbers. Much better to just specify the base when explicitly converting.)
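For example, the first four byte pairs of the dump from the question decode to "text":

assert( unhex( "74657874" ) == "text" )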
The resulting string should be exactly what went into the hex-dumper, no matter the encoding.
If you cannot correctly display the result, your viewer will also be unable to display the original input. If you used different viewers for the original input and the resulting output (e.g. a text editor and a terminal), try writing the output to a file instead and looking at it with the same viewer you used for the original input; then the two should be exactly the same.
Getting viewers that assume different encodings (e.g. one of the "old" 8-bit code pages or one of the many versions of Unicode) to display the same thing will require conversion between different formats, which tends to be quite complicated or even impossible. As you did not mention what encodings are involved (nor any other information like OS or programs used that might hint at the likely encodings), this could be just about anything, so it's impossible to say anything more specific on that.
You actually have a couple of problems:
First, make sure you know the meaning of the term character encoding, and that you know the difference between characters and bytes. A popular post on the topic is The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
Then, what encoding was used for the bytes you just received? You need to know this, otherwise you don't know what byte 234 means. For example it could be ISO-8859-1, in which case it is U+00EA, the character ê.
The characters 0 to 31 are control characters (e.g. 0 is NUL). Use a lookup table for these, as in the sketch below.
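A minimal Lua sketch of such a table (only a few of the control codes are named here; extend the table as needed):

local names = { [0] = "NUL", [1] = "SOH", [3] = "ETX", [9] = "TAB", [10] = "LF", [13] = "CR" }

local function showControls(s)
  return (s:gsub("%c", function(c)    -- %c matches control characters
    local b = c:byte()
    return "<" .. (names[b] or string.format("%02X", b)) .. ">"
  end))
end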
Then, displaying the characters on the terminal is the hard part. There is no platform-independent way to display ê on the terminal. It may well be impossible with the standard print function. If you can't figure this step out you can search for a question dealing specifically with how to print Unicode text from Lua.

Huffman code for a single character?

Let's say I have a massive string of just a single character, say x, and I need to use Huffman encoding.
A Huffman encoding is a full binary tree. So how does one create a Huffman code for just a single character, when we don't need two leaves at all?
jbr's answer is fine; this is just a longer version of it.
Huffman is meant to produce a minimal-length sequence of bits that contains all the information in the original sequence of symbols, assuming that the decoder already knows the set of symbols. If there's only one symbol, the input data contains no information except its length.
In Huffman-based data formats, length is usually encoded separately, not as part of the Huffman-encoded bit sequence itself. The decoder of a single-symbol Huffman code therefore has all the information it needs to reconstruct the input without reading anything from the Huffman-encoded bit sequence. It is logical, then, that the Huffman encoder's output should be 0 bits long.
If you don't have a length encoded separately, then you must have a symbol to represent End Of Sequence so the decoder knows when to stop reading. Then your Huffman tree will have two leaves and you won't run into this special case.
If you only have one symbol, then you only need 1 bit per symbol. So you really don't have to do anything except count the number of bits and translate each into your symbol.
You could simply add an edge case to your code.
For example:
Check whether there is only one distinct character in your hash table, in which case the tree-building step returns just the root of the tree, without any leaves. In that case, you could assign a code to this root node in your encoding function, such as the single bit 0.
Your decoding function then needs to handle the same edge case; a sketch follows below.
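A minimal Lua sketch of that edge case (my own illustration; the function names are made up), using the one-bit-per-symbol idea from above:

-- If the frequency table holds a single distinct symbol, the "tree" is just
-- the root: give it the one-bit code "0" and let the length do the rest.
local function encodeSingleSymbol(s)
  return s:sub(1, 1), string.rep("0", #s)  -- the symbol (for the header) and the bitstream
end

local function decodeSingleSymbol(symbol, bits)
  return string.rep(symbol, #bits)         -- one symbol per bit read
end

assert(decodeSingleSymbol(encodeSingleSymbol("xxxx")) == "xxxx")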

Delphi - Problem with SetString and PAnsiChar and Other Strings Not Displaying

I was getting advice from Rob Kennedy, and one of his suggestions, which greatly increased the speed of an app I was working on, was to use SetString and then load the result into the VCL component that displays it.
I'm using Delphi 2009, where PChar is now Unicode, so
SetString(OutputString, PChar(Output), OutputLength.Value);
edtString.Text := edtString.Text + OutputString;
works, and I changed it to PChar myself. But the data being moved isn't always Unicode; in fact it's usually ShortString data. So, on to what he actually gave me to use:
SetString(OutputString, PAnsiChar(Output), OutputLength.Value);
edtString.Text := edtString.Text + OutputString;
Nothing shows up, but I checked in the debugger, and the text that normally appeared with my old method of building it one char at a time is in the variable.
Oddly enough, this is not the first time I've run into this tonight. While trying to come up with another way, I took part of his advice: instead of building into a VCL component's TCaption, I built the text in a string variable and then copied it over, but when I send it over nothing is displayed. Once again, in the debugger, the variable the data is built in has the data.
for I := 0 to OutputLength.Value - 1 do
begin
  OutputString := OutputString + Char(OutputData^[I]);
end;
edtString.Text := OutputString;
The above does not work, but the old slow way of doing it worked just fine:
for I := 0 to OutputLength.Value - 1 do
begin
  edtString.Text := edtString.Text + Char(OutputData^[I]);
end;
I tried making the variable a ShortString, a String, and a TCaption, and nothing is displayed. What I also find interesting is that building my hex data from the same array into a rich edit is very fast, while doing the text data inside an edit control is very, very slow. That's why I haven't bothered changing the code for the rich edit; it works super fast as it is.
Edit to add - I think I've sort of found the problem, but I have no solution. If I edit the value in the debugger to remove anything that can't be displayed (which under the old method would simply not display rather than fail), then what I have left is displayed. So if it's just a matter of getting rid of bytes that were turned into garbage characters, how can I fix that?
I basically have incoming raw data from a SCSI device that's being displayed hex-editor style. My original slow approach successfully displayed strings, including Unicode strings as long as they contained no Unicode-specific characters. The faster methods, even where they work, won't display ShortStrings one way, and the other way won't display UnicodeStrings that use characters outside 0-255. I really like, and could use, the speed boost, but if it means sacrificing the ability to read the string, then what's the point of the app?
EDIT3 - All right, now that I've figured out that 0-31 are control characters and 32 and up are valid, I'm going to attempt to filter the characters and replace the invalid ones with a '.', which is something I was planning to do later anyway to emulate hex-editor style.
If there are any other suggestions I'd be glad to hear about them but otherwise I think I can craft a solution that's faster than the original and does what I need it to at the same time.
Some comments:
Your question is very unclear. What exactly do you want to do?
Your question reads terribly; please check your text with a spell checker.
The question you are referring to is this one: Delphi accessing data from dynamic array that is populated from an untyped pointer
Please give a complete code sample of your function like you did in your previous question; I want to know whether you implemented Rob Kennedy's suggestion or the code you gave yourself in a follow-up answer (let's hope not :) )
As far as I understand your question: you're sending a query to your SCSI device and you get an array of bytes back, which you store in the variable OutputData. After that you want to show the data to the user. So your real question is: how do you show an array of bytes to the user?
Log in as the same user and don't create an account for every new question. That way we can track your question history and find out what you mean by 'getting advice'.
Some assumptions and suggestions if I'm right about the true meaning of your question:
Showing your data as a hex string won't give any problems
Showing your data in a normal memo field gives you problems: although a Delphi string can contain any character, including #0 bytes, displaying them will not work. A TMemo, for example, will show your data only up to the first #0 byte. What you have to do (and you gave the answer yourself) is replace non-viewable characters with a dummy; after that you can show your data in a TMemo. Actually, all hex viewers do the same: characters that cannot be printed are shown as a dot. A sketch follows this list.
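A minimal Delphi sketch of that replacement (my own illustration: the function name is made up, TBytes comes from SysUtils, and 32..126 is the usual hex-editor notion of printable):

function MakePrintable(const Data: TBytes): string;
var
  I: Integer;
begin
  SetLength(Result, Length(Data));
  for I := 0 to High(Data) do
    if (Data[I] >= 32) and (Data[I] <= 126) then
      Result[I + 1] := Char(Data[I])  // printable ASCII: keep the character
    else
      Result[I + 1] := '.';           // control or extended byte: show a dot
end;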
I used PAnsiChar in my example for a reason. It looked like OutputLength was being measured in bytes, not characters, so I made sure to use a type whose length is always measured in bytes. You'll also notice that I showed the declaration of OutputString as an AnsiString.
Since the edit control stores Unicode, though, there will be a conversion between AnsiString and UnicodeString. That will take the system's current code page into account, but that's probably not what you want. You might want to declare the variable as a RawByteString instead. That won't have any code page associated with it, so there won't be any unexpected conversions.
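That is (a sketch that reuses the SetString call from the question, with only the declaration changed):

var
  OutputString: RawByteString;  // no code page attached: the bytes pass through unconverted
begin
  SetString(OutputString, PAnsiChar(Output), OutputLength.Value);
end;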
Don't use strings for storing binary data. If you're building what amounts to a hex editor, then you're working with binary data. It's important to remember that. Even if your binary data happens to consist mostly of bytes that can be interpreted as text, you can't treat the data as text or you'll run into exactly the problems you're seeing — characters that don't appear as expected. If you get a bunch of bytes from your SCSI device, then store them in an array of bytes, not characters.
In hex editors, you'll notice that they always show the hexadecimal values of the bytes. They might show those bytes interpreted as characters, but that's secondary, and they generally only show the bytes that can represent ASCII characters; they don't try to get too fancy with the basic display. The good hex editors will offer to display the data interpreted as wide characters, too. This aids in debugging because the user can look at the same data in multiple ways. But they're just views of the data. They're not actually changing the binary contents of the data.
When you filter out non-viewable characters, you'll probably need to decide what to do with a few of them, like #9 (Tab), #10 (LF), #11 (Vertical Tab), #12 (FF, form feed / new page), and #13 (CR).
