What is the difference between a Binary and a Bitstring in Erlang?

In the Erlang shell, I can do the following:
A = 300.
300
<<A:32>>.
<<0, 0, 1, 44>>
But when I try the following:
B = term_to_binary({300}).
<<131,104,1,98,0,0,1,44>>
<<B:32>>.
** exception error: bad argument
<<B:64>>.
** exception error: bad argument
In the first case, I'm taking an integer and using the bitstring syntax to put it into a 32-bit field. That works as expected. In the second case, I'm using the term_to_binary BIF to turn the tuple into a binary, from which I attempt to unpack certain bits using the bitstring syntax. Why does the first example work, but the second example fail? It seems like they're both doing very similar things.

The difference between a binary and a bitstring is that the length of a binary is evenly divisible by 8, i.e. it contains no 'partial' bytes; a bitstring has no such restriction.
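For example, a quick shell sketch (not from your session) makes the distinction visible: <<1,2,3>> is 24 bits long and therefore a binary, while <<5:3>> is only 3 bits long and therefore only a bitstring:
is_binary(<<1,2,3>>).
true
is_binary(<<5:3>>).
false
is_bitstring(<<5:3>>).
true
bit_size(<<5:3>>).
3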
This difference is not your problem here.
The problem you're facing is that your syntax is wrong. If you would like to extract the first 32 bits from the binary, you need to write a complete matching statement - something like this:
<<B1:32, _/binary>> = B.
Note that the /binary is important, as it matches the remainder of the binary regardless of its length. If it is omitted, the last segment defaults to an 8-bit integer, i.e. it matches exactly one byte.
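Applied to the B from your example, the shell session would look roughly like this (a sketch; 2204631394 is simply the integer value of the first four bytes 131, 104, 1 and 98):
<<B1:32, Rest/binary>> = B.
<<131,104,1,98,0,0,1,44>>
B1.
2204631394
Rest.
<<0,0,1,44>>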
You can read more about binaries and working with them in the Erlang Reference Manual's section on bit syntax.
EDIT
To your comment, <<A:32>> isn't just for integers, it's for values. Per the link I gave, the bit syntax allows you to specify many aspects of binary matching, including data types of bound variables - while the default type is integer, you can also say float or binary (among others). The :32 part indicates that 32 bits are required for a match - that may or may not be meaningful depending on your data type, but that doesn't mean it's only valid for integers. You could, for example, say <<Bits:10/bitstring>> to describe a 10-bit bitstring. Hope that helps!

The <<A:32>> syntax constructs a binary. To deconstruct a binary, you need to use it as a pattern, instead of using it as an expression.
A = 300.
% Converts a number to a binary.
B = <<A:32>>.
% Converts a binary to a number.
<<A:32>> = B.

Related

How do I keep my rails integer from being converted to binary?

As you may be able to see in the image, I have a User model and #user.zip is stored as an integer for validation purposes (ie, so only digits are stored, etc.). I was troubleshooting an error when I discovered that my sample zip code (00100) was automatically being converted to binary, and ending up as the number 64.
Any ideas on how to keep this from happening? I am new to Rails, and it took me a few hours to figure out the cause of this error, as you might imagine :)
I can't imagine any other information would be helpful here, but please inform me if otherwise.
This is not binary, this is octal.
In Ruby, any integer literal that starts with 0 (and no other prefix letter) is treated as an octal number. You should check the Ruby documentation on number literals to learn more about this; here's a quote:
You can use a special prefix to write numbers in decimal, hexadecimal, octal or binary formats. For decimal numbers use a prefix of 0d, for hexadecimal numbers use a prefix of 0x, for octal numbers use a prefix of 0 or 0o, for binary numbers use a prefix of 0b. The alphabetic component of the number is not case-sensitive.
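A quick irb sketch (not from the original question) shows the effect:
0100      # => 64, an octal literal
00100     # => 64, extra leading zeros change nothing, still octal
0b100     # => 4, a binary literal
"00100"   # => "00100", a string keeps its leading zeros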
In your case, you should not store zip codes as numbers, and not only in the database: don't treat them as numeric values in your variables either. Instead, store and treat them as strings.
The zip should probably be stored as a string since you can't have a valid integer with leading zeroes.

How are value equality comparisons executed in a program?

How does the computer execute value equality comparisons? Does it compare the values bit by bit, starting from the smallest bit, and stop once two different bits are encountered? Or does it start from the highest bit? Does it go through all bits regardless of where/when two unlike bits are found?
When you write an equality comparison in a higher-level language (e.g. C), it is transformed into an intermediate representation and then into the instructions of the particular platform the code will run on. The compiler is free to implement the equality comparison using any of the instructions available on the target architecture; the idea is usually to make it faster.
Different architectures have different instruction sets. Different processors can have varying implementation strategies (again to make things faster), as long as they comply with the spec.
Below are a few examples
x86
The CMP instruction is used to compare two values. Here's an excerpt from the instruction set reference:
Compares the first source operand with the second source operand and sets the status flags in the EFLAGS register according to the results. The comparison is performed by subtracting the second operand from the first operand and then setting the status flags in the same manner as the SUB instruction. When an immediate value is used as an operand, it is sign-extended to the length of the first operand.
This basically means all bits are examined. I guess it was implemented that way to allow ordering comparisons (<, >) too.
So all bits are examined. In the simplest cases that can be done serially, but can be done faster. See wikibooks on add / subtract blocks.
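As an illustration (a hypothetical C snippet, not from the question), a plain equality test usually compiles down to a single CMP plus a flag-reading instruction:
int equal(int a, int b)
{
    /* With gcc -O2 on x86-64 this typically becomes:
       cmp edi, esi ; sete al ; movzx eax, al ; ret */
    return a == b;
}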
ARM
The TEQ instruction can be used to test two values for equality.
Here's an excerpt from infocenter.arm.com:
The TEQ instruction performs a bitwise Exclusive OR operation on the value in Rn and the value of Operand2. This is the same as the EORS instruction, except that it discards the result.
Use the TEQ instruction to test if two values are equal without affecting the V or C flags.
Again all bits are examined.

Lua bit library

Right now I have made my own functions to do bitwise AND and NOT, but then I saw the bit library and tried to use it. It doesn't work how I imagined: it returns a large decimal instead of the binary bits, so my question is actually a few questions.
First: how do I do a bitwise AND on binary numbers using the bit32 library?
10110111
11000100 = 10000100
Second: how do I calculate the IPv4 broadcast address by adding the network address and the wildcard mask in binary form using the bit32 library?
192.168.1.0 + 31 = 192.168.1.31
11000000.10101000.00000001.00000000
00000000.00000000.00000000.00011111 = 11000000.10101000.00000001.00011111
I am assuming that your bitwise and / not functions take string arguments.
Numbers can be represented in multiple ways.
The number 110101, which is in base two, has the same value as 53, which is in base 10.
When you say
x=123
Lua converts 123 into its binary representation, 1111011, which it then stores in memory as bits.
When you say
print(x)
Lua goes into memory, grabs x, which is 1111011, and then converts it into its more human-readable base 10 representation, and you see
123
The bitwise functions you wrote perform bit operations on strings that display the binary representation of a number, like "1111011". The bit32 library performs bit operations on numbers, which display as the decimal representation of a number, like 123.
In Lua, "1001001" is a string, but if arithmetic operations are performed on it, Lua treats it as if it were a number written in base 10. So when you do
bit32.band("101","110")
the bit32.band function interprets its arguments as one-hundred-one and one-hundred-ten.
You must first convert your binary strings into numbers:
bit32.band(tonumber("101",2), tonumber("110",2))
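For the second question, the broadcast address can be computed the same way once the dotted-quad strings are turned into numbers. A rough sketch, assuming Lua 5.2's bit32 library and hypothetical helper names ip_to_int / int_to_ip:
-- Pack four octets into one 32-bit number
local function ip_to_int(ip)
  local a, b, c, d = ip:match("(%d+)%.(%d+)%.(%d+)%.(%d+)")
  return bit32.bor(bit32.lshift(tonumber(a), 24),
                   bit32.lshift(tonumber(b), 16),
                   bit32.lshift(tonumber(c), 8),
                   tonumber(d))
end
-- Unpack a 32-bit number back into dotted-quad form
local function int_to_ip(n)
  return string.format("%d.%d.%d.%d",
    bit32.rshift(n, 24),
    bit32.band(bit32.rshift(n, 16), 0xFF),
    bit32.band(bit32.rshift(n, 8), 0xFF),
    bit32.band(n, 0xFF))
end
-- broadcast = network address OR wildcard mask
print(int_to_ip(bit32.bor(ip_to_int("192.168.1.0"), 31)))  --> 192.168.1.31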

Huffman code for a single character?

Let's say I have a massive string of just a single character, say x. I need to use Huffman encoding.
A Huffman encoding is a full binary tree. So how does one create a Huffman code for just a single character, when we don't need two leaves at all?
jbr's answer is fine; this is just a longer version of it.
Huffman is meant to produce a minimal-length sequence of bits that contains all the information in the original sequence of symbols, assuming that the decoder already knows the set of symbols. If there's only one symbol, the input data contains no information except its length.
In Huffman-based data formats, length is usually encoded separately, not as part of the Huffman-encoded bit sequence itself. The decoder of a single-symbol Huffman code therefore has all the information it needs to reconstruct the input without needing to read anything from the Huffman-encoded bit sequence. It is logical, then, that the Huffman encoder's output should be 0 bits long.
If you don't have a length encoded separately, then you must have a symbol to represent End Of Sequence so the decoder knows when to stop reading. Then your Huffman tree will have 2 nodes and you won't run into this special case.
If you only have one symbol, then you only need 1 bit per symbol. So you really don't have to do anything except count the number of bits and translate each into your symbol.
You could simply add an edge case in your code.
For example:
Check if there is only one character in your hash table, in which case the tree-building step returns only the root of the tree without any leaves. In this case, you could assign a code to this root node in your encoding function, such as 0.
The decoding function should handle this edge case too.
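A minimal sketch of that edge case (in Python, with hypothetical names; the usual tree construction is assumed to live elsewhere):
def build_codes(freq_table):
    # Single-symbol input: the tree is only a root, so give it the code "0".
    if len(freq_table) == 1:
        (symbol,) = freq_table
        return {symbol: "0"}
    # ... usual Huffman construction for two or more symbols ...
    raise NotImplementedError

def encode(text, codes):
    return "".join(codes[ch] for ch in text)

codes = build_codes({"x": 4})
print(encode("xxxx", codes))   # prints 0000; the decoder maps each 0 back to x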

What is the difference between Delphi string comparison functions?

There's a bunch of ways you can compare strings in modern Delphi (say 2010-XE3):
'<=' operator which resolves to UStrCmp / LStrCmp
CompareStr
AnsiCompareStr
Can someone give (or point to) a description of what those methods do, in principle?
So far I've figured that AnsiCompareStr calls CompareString on Windows, which is a "textual" comparison (i.e. takes into account unicode combined characters etc). Simple CompareStr does not do that and seems to do a binary comparison instead.
But what is the difference between CompareStr and UStrCmp? Between UStrCmp and LStrCmp? Do they all produce identical results? Do those results change between versions of Delphi?
I'm asking because I need a comparison which will always produce the same results, so that indexes in app built with one version of Delphi remain consistent with code built with another.
AnsiCompareStr is specified as taking locale into account, and should return identical results regardless of Delphi version, but may return different results based on Windows version and/or settings. CompareStr is a pure binary comparison: "The comparison operation is based on the 16-bit ordinal value of each character and is not affected by the current locale" (for the CompareStr(const S1, S2: string) overload). UStrCmp also uses a pure binary comparison: "Strings are compared according to the ordinal values that make up the characters that make up the string." So there should not be a difference between the latter two. The way they return the result is different, so two implementations are needed (although it would be possible to make one rely on the other).
As for the differences between LStrCmp and UStrCmp, LStrCmp takes AnsiStrings, UStrCmp takes UnicodeStrings. It's entirely possible that two characters (let's say A and B) are ordered in the misnamed "ANSI" code page as A < B, but are ordered in Unicode as A > B. You should almost always just use the comparison appropriate for the data you have.
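If you need an ordering that stays identical across Delphi and Windows versions (for example for persistent indexes), the ordinal comparison is the safe choice. A small sketch (hypothetical values, just to show the two calls side by side):
program CompareDemo;
{$APPTYPE CONSOLE}
uses SysUtils;
var
  A, B: string;
begin
  A := 'apple';
  B := 'Apple';
  // Ordinal (binary) comparison: identical result on every Delphi and Windows version
  Writeln(CompareStr(A, B));
  // Locale-aware comparison: can change with the Windows version or user settings
  Writeln(AnsiCompareStr(A, B));
end.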
