A GNU program that computes MD5 sums for arbitrary files (read md5 hash function to understand what we are talking about). Present in many Linux distributions, it conforms to RFC 1321. Checksums are printed as 32 hexadecimal digits (the digits 0-9 and the letters a-f), which means that they can be transmitted through email without encoding.
The program basically runs in two modes. The first (default) mode is checksum generation. A practical application:
-rw-rw-r-- 1 baffo baffo 92738 Aug 3 2000 bff30.jpg
-rw-rw-r-- 1 baffo baffo 92738 Aug 3 2000 swizzler30.jpg
Feh! Two files with the same size and the same date. Maybe they contain exactly the same data. Let us see:
bash$ md5sum swizzler30.jpg bff30.jpg
Since they have the same checksum, we can safely assume that they have the same contents.
Suppose you save the generated checksums in a file:
bash$ md5sum abd* >checksum.md5
bash$ more checksum.md5
You will later be able to run the program in its second mode, with the --check or -c switch. This will tell you if the files have changed, even by a single bit.
bash$ md5sum -c checksum.md5
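The whole round trip can be sketched as follows. This is only an illustration under stated assumptions: abd1.txt and abd2.txt are hypothetical files created on the spot, check is a small helper function invented for this example, and LC_ALL=C is set so that md5sum reports in English.

```shell
export LC_ALL=C   # keep md5sum's OK/FAILED messages untranslated

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two hypothetical files matching the abd* pattern used above.
printf 'first file\n'  > abd1.txt
printf 'second file\n' > abd2.txt

# Mode 1: generate the checksums and save them.
md5sum abd* > checksum.md5

# Hypothetical helper: run mode 2 and turn the exit status into a verdict.
check() {
    if md5sum -c checksum.md5; then
        echo "verdict: all files match"
    else
        echo "verdict: at least one file differs"
    fi
}

# Nothing has changed yet, so every file should be reported as OK.
before=$(check 2>&1)

# Alter one file by a single character and check again.
printf 'first file.\n' > abd1.txt
after=$(check 2>&1)

echo "$before"
echo "$after"
```

md5sum -c returns a non-zero exit status when any file fails the check, which is what makes it usable in an if statement like the one in the helper.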
The basic property of the MD5 hash is precisely that a minute change of the input file will produce a wildly different result.
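A quick way to watch this happen, sketched with two made-up three-byte inputs fed through standard input. The strings abc and abb differ in exactly one bit (the last byte is 0x63 in one and 0x62 in the other); the variable names are invented for the example.

```shell
# printf is used instead of echo so that no trailing newline sneaks in.
# When md5sum reads standard input it prints "-" as the file name,
# so cut keeps only the checksum column.
sum_abc=$(printf 'abc' | md5sum | cut -d' ' -f1)
sum_abb=$(printf 'abb' | md5sum | cut -d' ' -f1)

# The value for "abc" is one of the test vectors listed in RFC 1321:
# 900150983cd24fb0d6963f7d28e17f72
echo "abc -> $sum_abc"
echo "abb -> $sum_abb"

if [ "$sum_abc" != "$sum_abb" ]; then
    echo "a one-bit change gave a different checksum"
fi
```

Putting the two printed checksums side by side shows how little they have in common.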
It is very difficult (read: a computational nightmare) to generate a file that has a given MD5 hash. This holds true even though some MD5 collisions are known; you can read about them in ariels' md5 hash function.
This is why you can also use the md5sum program to check whether a file has been transferred without tampering or corruption.
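The fact that MD5 hashing has collisions should not come as a surprise, of course: after all, it is a hash function. It maps files of any length onto 128-bit values, so distinct files with the same checksum must exist.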