No account? Create an account
 Adventures in EngineeringThe wanderings of a modern ronin.

 Date: 2006-06-16 20:19 Subject: Today we learn about CRCs. Public
A cyclic redundancy check (CRC) is a type of hash function used to produce a checksum (a small, fixed number of bits) against a block of data, such as a packet of network traffic or a block of a computer file. The checksum is used to detect errors after transmission or storage. A CRC is computed and appended before transmission or storage, and verified afterwards by recipient to confirm that no changes occurred on transit. CRCs are popular because they are simple to implement in binary hardware, are easy to analyze mathematically, and are particularly good at detecting common errors caused by noise in transmission channels.

http://en.wikipedia.org/wiki/Cyclic_redundancy_check

The Wikipedia article doesn't lack for technical correctness, but until I go back to school and get that graduate level math degree, I think my understanding of "division in the ring of polynomials over the finite field GF(2)" is going to be hazy at best.

A better resource for learning about CRCs is Ross Williams' "Painless Guide To CRC Error Detection Algorithms":

The basic idea of CRC algorithms is simply to treat the message as an enormous binary number, to divide it by another fixed binary number, and to make the remainder from this division the checksum. Upon receipt of the message, the receiver can perform the same division and compare the remainder with the "checksum" (transmitted remainder). The quotient, or normal result of the division, is simply thrown away.

The word you will hear all the time when dealing with CRC algorithms is the word "polynomial". A given CRC algorithm will be said to be using a particular polynomial, and CRC algorithms in general are said to be operating using polynomial arithmetic. What does this mean?

Instead of the divisor, dividend (message), quotient, and remainder (checksum) being viewed as positive integers, they are viewed as polynomials with binary coefficients. This is done by treating each number as a bit-string whose bits are the coefficients of a polynomial.

Mathematicians came up with all sorts of different kinds of polynomial arithmetics simply by changing the rules about how coefficients work. Of these schemes, one in particular is relevant here, and that is a polynomial arithmetic where the coefficients are calculated MOD 2 and there is no carry; all coefficients must be either 0 or 1 and no carries are calculated. This is called "polynomial arithmetic mod 2".

Polynomical arithmetic mod 2 is just binary arithmetic mod 2 with no carries. While polynomials provide useful mathematical machinery in more analytical approaches to CRC and error-correction algorithms, for the purposes of exposition they provide no extra insight and some encumbrance and have been discarded in the remainder of this document in favour of direct manipulation of the arithmetical system with which they are isomorphic: binary arithmetic with no carry.

To perform a CRC calculation, we need to choose a divisor. In maths marketing speak the divisor is called the "generator polynomial" or simply the "polynomial", and is a key parameter of any CRC algorithm. It would probably be more friendly to call the divisor something else, but the poly talk is so deeply ingrained in the field that it would now be confusing to avoid it. As a compromise, we will refer to the CRC polynomial as the "poly". You can choose any poly and come up with a CRC algorithm. However, some polys are better than others, and so it is wise to stick with the tried an tested ones. Also, the degree of the polynomial is an important parameter, known as the "width", W.

Having chosen a poly, we can proceed with the calculation. This is simply a division (in CRC arithmetic) of the message by the poly. The only trick is that W zero bits are appended to the message before the CRC is calculated.

The division yields a quotient, which we throw away, and a remainder, which is the calculated checksum. This ends the calculation. Usually, the checksum is then appended to the message and the result transmitted.

At the other end, the receiver can do one of two things:

1. Separate the message and checksum. Calculate the checksum for the message (after appending W zeros) and compare the two checksums.
2. Checksum the whole lot (without appending zeros) and see if it comes out as zero!

These two options are equivalent. However, in the next section, we will be assuming option b because it is marginally mathematically cleaner.

A summary of the operation of the class of CRC algorithms:

1. Choose divisor polynomial, G, of width W.
2. Append W zero bits to the message. Call this M'.
3. Divide M' by G, using CRC arithmetic. The remainder is the checksum.

That's all there is to it.

There's also an excellent section on optimizing CRC implementations.

browse