A few questions about CRC basics - checksum

I am an electronic engineer and have not found it important to consider CRC from a purely mathematical perspective. However, I have the following questions:
Why do we add n zeros to the message when we calculate the CRC, were n is the degree of the generator polynomial? I have seen this in the modulo-2 long division as well as the hardware implementation of CRC
Why do we want that the generator polynomial be divisible by (x+1)?
Why do we want that the generator polynomial not be divisible by x?

We add n zeros when computing an n-bit CRC because, when appending the CRC to the message and sending the whole (a usual practice in telecoms):
That allows the receiving side to process the bits of the CRC just as the rest of the message is, leading to a known remainder for any error-free transmission. This is especially useful when the end of the message is indicated by something that follows the CRC (a common practice); on the receiving side it saves an n bit buffer, and on the transmit side it adds virtually no complexity (the extra terms of x(n) reduce to an AND gate forcing message bits to zero during CRC transmission, and the n extra reduction steps are performed as the CRC is transmitted).
Mathematically, the CRC sent is (M(x) * x^n) mod P(x) = R(x) (perhaps, within some constant, or/and perhaps with some prescribed bits added at beginning of M(x), corresponding to an initialization of the CRC register), and the CRC computed on the receiving side is over the concatenation of M(x) and R(x), that is
(M(x) * x^n + R(x)) mod P(x), which is zero (or said constant).
It insures that burst of errors affecting both the end of the message and the contiguous CRC benefit from the full level of protection afforded by the choice of polynomial. In particular, if we computed C(x) as M(x) mod P(x), flipping the last bit of M(x) and the last bit of C(x) would go undetected, when most polynomials used in error detection insure that any two-bit error is detected up to some large message size.
It is common practice to have CRC polynomials used for error detection divisible by x+1, because it ensures that any error affecting an odd number of bits is detected. However that practice is not universal, and it would sometime prevents selection of a better polynomial for some useful definitions of better, including maximizing the length of message such that m errors are always detected (assuming no synchronization loss), for some combinations of m and n. In particular, if we want to be able to detect any 2-bit error for the longest message possible (which will be 2n-1 bits including n-bit CRC), we need the polynomial to be primitive, thus irreducible, thus (for n>1) not divisible by x+1.
It is universal practice to have CRC polynomials used for error detection not divisible by x, because otherwise the last bit of CRC generated would be constant, and would not help towards detection of errors in the rest of the message+CRC.

Related

16 bit Checksum fuzzy analsysis - Leveraging "collisions", biases a thing?

If playing around with CRC RevEng fails, what next? That is the gist of my question. I am trying to learn more how to think for myself, not just looking for an answer 1 time to 1 problem.
Assuming the following:
1.) You have full control of white box algorithm and can create as many chosen sample messages as you want with valid 16 bit / 2 byte checksums
2.) You can verify as many messages as you want to see if they are valid or not
3.) Static or dynamic analysis of the white box code is off limits (say the MCU is of a lithography that would require electron microscope to analyze for example, not impossible but off limits for our purposes).
Can you use any of these methods or lines of thinking:
1.) Inspect "collisions", i.e. different messages with same checksum. Perhaps XOR these messages together and reveal something?
2.) Leverage strong biases towards certain checksums?
3.) Leverage "Rolling over" of the checksum "keyspace", i.e. every 65535 sequentially incremented messages you will see some type of sequential patterns?
4.) AI ?
Perhaps there are other strategies I am missing?
CRC RevEng tool was not able to find the answer using numerous settings configurations
Key properties and attacks:
If you have two messages + CRCs of the same length and exclusive-or them together, the result is one message and one pure CRC on that message, where "pure" means a CRC definition with a zero initial value and zero final exclusive-or. This helps take those two parameters out of the equation, which can be solved for later. You do not need to know where the CRC is, how long it is, or which bits of the message are participating in the CRC calculation. This linear property holds.
Working with purified examples from #1, if you take any two equal-length messages + pure CRCs and exclusive-or them together, you will get another valid message + pure CRC. Using this fact, you can use Gauss-Jordan elimination (over GF(2)) to see how each bit in the message affects other generated bits in the message. With this you can find out a) which bits in the message are particiapting, b) which bits are likely to be the CRC (though it is possible that other bits could be a different function of the input, which can be resolved by the next point), and c) verify that the check bits are each indeed a linear function of the input bits over GF(2). This can also tell you that you don't have a CRC to begin with, if there don't appear to be any bits that are linear functions of the input bits. If it is a CRC, this can give you a good indication of the length, assuming you have a contiguous set of linearly dependent bits.
Assuming that you are dealing with a CRC, you can now take the input bits and the output bits for several examples and try to solve for the polynomial given different assumptions for the ordering of the input bits (reflected or not, perhaps by bytes or other units) and the direction of the CRC shifting (reflected or not). Since you're talking about an allegedly 16-bit CRC, this can be done most easily by brute force, trying all 32,768 possible polynomials for each set of bit ordering assumptions. You can even use brute force on a 32-bit CRC. For larger CRCs, e.g. 64 bits, you would need to use Berlekamp's algorithm for factoring polynomials over finite fields in order to solve the problem before the heat death of the universe. Then you would factor each message + pure CRC as a polynomial, and look for common factors over multiple examples.
Now that you have the message bits, CRC bits, bit orderings, and the polynomial, you can go back to your original non-pure messages + CRCs, and solve for the initial value and final exclusive-or. All you need are two examples with different lengths. Then it's a simple two-equations-in-two-unknowns over GF(2).
Enjoy!

Bit encoding for vector of rational numbers

I would like to implement ultra compact storage for structures with rational numbers.
In the book "Theory of Linear and Integer Programming" by Alexander Schrijver, I found the definition of bit sizes (page. 15) of rational number, vector and matrix:
The representation of rational number is clear: single bit for sign and logarithm for quotient and fraction.
I can't figure out how vector can be encoded only in n bits to distinguish between its elements?
For example what if I would like to write vector of two elements:
524 = 1000001100b, 42 = 101010b. How can I use only 2 additional bits to specify when 1000001100 ends and 101010 starts?
The same problem exists with matrix representation.
Of course, it is not possible just to append the integer representations to each other, and add the information about the merging place, since this would take much more bits than given by the formula in the book, which I don't have access to.
I believe this is a problem from coding theory where I am not an expert. But I found something that might point you to the right direction. In this post an "interpolative code" is described among others. If you apply it to your example (524, 42), you get
f (the number of integers to be encoded, all in the range [1,N] = 2
N = 524
The maximum bit length of the encoded 2 integers is then
f • (2.58 + log (N/f)) = 9,99…, i.e. 10 bits
Thus, it is possible to have ultra compact encoding, although one had to spend a lot of time for coding and decoding.
It is impossible to use only two bits to specify when the quotient end and fraction start. At least you will need as big as the length of the quotient or/and the length of the fraction size. Another way is to use a fixed number of bits for both quotient and fraction similar with IEEE 754.

Which one is the better CRC scheme?

Say I have to error-check a message of some 120-bits long.I have two alternative for checksum schemes:
Split message to 5 24-bit strings and append each with a CRC8 field
Append the whole message with a CRC32 field
Which scheme has a higher error detection probability, and why? Let's assume no prior knowledge about the error patterns distribution.
UPDATE:
What if the system has a natural mode of failure which is a received cleared bit instead of a set bit (i.e., "1" was Tx-ed but "0" was Rx-ed), and the opposite does not happen?
In this case, the probability of long bursts of error bits is much smaller, assuming that the valid data has a uniform distribution of "0"s and "1"s, so the longest burst will be bound by the longest string of "1"s in the message.
You have to make some assumption about the error patterns. If you have a uniform distribution over all possible errors, then five 8-bit CRCs will detect more of the errors than one 32-bit CRC, simply because the former has 40 bits of redundancy.
However, I can construct many 24-bit error patterns that fool an 8-bit CRC, and use any combination of five of those to get no errors over all of the 8-bit CRCs. Yet almost all of those will be caught by the 32-bit CRC.
A good paper by Philip Koopman goes through evaluation of several CRCs, mostly focusing on their Hamming Distance. Like Mark Adler pointed out, the error distribution plays an important role in CRC selection (e.g. burst errors detection is one of the variable properties of CRC), as is the length of the CRC'ed data.
The Hamming Distance of a CRC indicates the maximum number of errors in the data which are 100% detectable.
Ref:
Cyclic Redundancy Code (CRC) Polynomial Selection For Embedded Networks:
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.5.5027&rep=rep1&type=pdf
8-bit vs 32-bit CRC
For exemple, the 0x97 8-bit CRC polynomial has HD=4 up to 119 bits data words (which is more than your required 24-bit word), which means it detects 100% of 4 bits (or less) errors for data of length 119 bits or less.
On the 32-bit side, the 32-bit CRC 0x9d7f97d6 offer HD=9 up to 223 bits (greater than 5*24=120bits) data words. This means that it will detect 100% of the 9 bits (or less) errors for data composed of 223 bits or less.
Theoretically, 5x 8-bit CRCs would be able to 100% detect 4*4=16 evenly distributed bit flips across your 5 chunks (4 errors per 24-bit chunk). On the other end, the 32-bit CRC would only be able to 100% detect 9 bit flips per 120-bit chunk.
Error Distribution
Knowing all that, the only missing piece is the error distribution pattern. With it in hand, you'll be able to make an informed decision on the best CRC method to use. You seem to say that long burst of errors are not possible, but do not mention the exact maximum length. If that length does go up to 9 bits, you might be better off with the CRC32. If you expect occasional, <4-bit errors, both would do, though the 5x8-bit will consume more bandwidth (40 bits instead of 32 bits). If this is the case, a 32-bit CRC might even be overkill, a smaller CRC16 or even CRC9 could provide enough detection capabilities.
Beyond the hamming window, the CRC will not be able to catch every possible errors. The bigger the data length, the worse the CRC performances.
The CRC32 of course. It will detect ordering errors as between the five segments, as well as giving you 224 as much error detection.

Why is modulus operator slow?

Paraphrasing from in "Programming Pearls" book (about c language on older machines, since book is from the late 90's):
Integer arithmetic operations (+, -, *) can take around 10 nano seconds whereas the % operator takes up to 100 nano seconds.
Why there is that much difference?
How does a modulus operator work internally?
Is it same as division (/) in terms of time?
The modulus/modulo operation is usually understood as the integer equivalent of the remainder operation - a side effect or counterpart to division.
Except for some degenerate cases (where the divisor is a power of the operating base - i.e. a power of 2 for most number formats) this is just as expensive as integer division!
So the question is really, why is integer division so expensive?
I don't have the time or expertise to analyze this mathematically, so I'm going to appeal to grade school maths:
Consider the number of lines of working out in the notebook (not including the inputs) required for:
Equality: (Boolean operations) essentially none - in computer "big O" terms this is known a O(1)
addition: two, working left to right, one line for the output and one line for the carry. This is an O(N) operation
long multiplication: n*(n+1) + 2: two lines for each of the digit products (one for total, one for carry) plus a final total and carry. So O(N^2) but with a fixed N (32 or 64), and it can be pipelined in silicon to less than that
long division: unknown, depends upon the argument size - it's a recursive descent and some instances descend faster than others (1,000,000 / 500,000 requires less lines than 1,000 / 7). Also each step is essentially a series of multiplications to isolate the closest factors. (Although multiple algorithms exist). Feels like an O(N^3) with variable N
So in simple terms, this should give you a feel for why division and hence modulo is slower: computers still have to do long division in the same stepwise fashion tha you did in grade school.
If this makes no sense to you; you may have been brought up on school math a little more modern than mine (30+ years ago).
The Order/Big O notation used above as O(something) expresses the complexity of a computation in terms of the size of its inputs, and expresses a fact about its execution time. http://en.m.wikipedia.org/wiki/Big_O_notation
O(1) executes in constant (but possibly large) time. O(N) takes as much time as the size of its data-so if the data is 32 bits it takes 32 times the O(1) time of the step to calculate one of its N steps, and O(N^2) takes N times N (N squared) the time of its N steps (or possibly N times MN for some constant M). Etc.
In the above working I have used O(N) rather than O(N^2) for addition since the 32 or 64 bits of the first input are calculated in parallel by the CPU. In a hypothetical 1 bit machine a 32 bit addition operation would be O(32^2) and change. The same order reduction applies to the other operations too.

Does CRC has the following feature

When the data transmission is tampered 1 bit or 2 bits, can the receiver correct it automatically?
No, CRC is an error-detecting code, not an error-correcting code.
Read more here
CRC is primarily used as an error-detecting code. If the total number of bits (including those in the CRC) is smaller than the CRC's period, though, it is possible to correct single-bit errors by computing the syndrome (xor the calculated and received CRC's). Each bit will, if flipped individually, generate a unique syndrome. One can iterate the CRC algorithm to find the syndrome that would be associated with each bit; if one finds the syndrome associated with each bit, one can flip it and correct a single-bit error.
One major danger with doing this, though, is that the CRC will be much less useful for rejecting bogus data. If one uses an 8-bit CRC on a packet with 15 bytes of data, only one in 256 random packets would pass validity, but half of all random packets could be "corrected" by flipping a single bit.

Resources