
Posts Tagged ‘Verification’

If you’re unfamiliar with MD5 checksums (or MD5sums for short), each one is simply a 32-character string of hexadecimal digits (a hash) computed from a file’s contents, which can be used later to verify the integrity of the data. You may have noticed when downloading Linux .iso images or similar that there was either a text file alongside it (usually with the same name as the main file, but with an extension such as .md5sums) or the actual hash listed below the download link.

Also, you probably know you can check your Ubuntu CD for defects from the boot menu, but since that check just reads an md5sum.txt file (the most common name on Linux live CDs), you can also run it from within Ubuntu via the terminal. So, for example, if you’ve burned a copy of the latest Ubuntu (or other Linux distro) live CD for a friend, you can simply open a terminal and check it without having to reboot.

But the most important use of the md5sum command is to create data verification for folders on your drive, as well as data CDs and DVDs, and even video DVDs. If you just wanted to periodically make sure no files are corrupt in a given folder (or whole drive if you want), this is the way to go. If you have a whole bunch of things in a folder you want to burn to a data disc, then the checksum file you create will let you check the disc for defects.

θθθθθθθθθθθθθθθθθθθθθ

So when you want to create the checksums, open a terminal in that folder and enter the following:

find -type f -exec md5sum "{}" \; > md5sum.txt

Note that this will also create a hash for the checksum file itself, i.e. md5sum.txt, which will produce an error when checked, since its hash was computed while the file was still being written:

md5sum: WARNING: 1 of 103 computed checksums did NOT match

When you scroll up the terminal to see the cause of the error, you’ll find:

./md5sum.txt: FAILED

You will need to manually edit out the line for md5sum.txt in a text editor. If the file is really large, just hit Ctrl+F and search for md5, and it will take you to the line you need to delete.

OR:

To avoid md5sum.txt being added to the checksums altogether, run the following instead:

find -type f -exec md5sum "{}" \; | sed '/md5sum.txt/d' > md5sum.txt

Note that this leaves out not only the md5sum.txt currently being generated, but also any other files of the same name that already exist in subfolders being checked (in fact, any file whose path contains md5sum.txt). If you want to include all the other md5sum.txt files, run the first command instead, and just edit out the line for the one that was generated in the root folder.
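
If you’d rather keep the md5sum.txt files in subfolders and leave out only the one being generated, find itself can skip it. The following is a minimal sketch, assuming GNU find; make_sums is just a name made up for this example, not a standard command:

```shell
# make_sums is a hypothetical helper name.
# Hash every regular file under the current folder, skipping only the
# top-level ./md5sum.txt; md5sum.txt files in subfolders are still hashed.
make_sums() {
    find . -type f ! -path ./md5sum.txt -exec md5sum {} + > md5sum.txt
}
```

Run make_sums in the folder you want to hash. Ending -exec with + instead of \; hands many files to each md5sum call, which is usually quicker on big folders.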

Once that’s done, you can verify the folder/drive any time you wish. With discs, it isn’t limited to data backups: since the .vob files and so on of a video DVD are just data too, you can generate the md5sum.txt in the parent folder of the title (i.e. the one VIDEO_TS resides in) and check movies as well.

θθθθθθθθθθθθθθθθθθθθθ

To check a folder, open a terminal there and enter:

md5sum -c md5sum.txt

To check a disc that has that file, including the likes of the Ubuntu CD, you’ll need the terminal pointing at the disc. But rather than open a folder window and choose Open in Terminal from the context menu, you can do that via any open terminal and incorporate the checking command above with:

cd /media/cdrom0 && md5sum -c md5sum.txt

Occasionally systems don’t mount the disc at /media/cdrom0, so when you open a terminal there the other way, make note of the path shown and alter the last command accordingly.
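
Since the path can differ from one system to the next, it may be handier to pass it in rather than hard-code it. Here is a small sketch; check_disc is a made-up name, and the default is simply the path used in the command above:

```shell
# check_disc is a hypothetical helper name.
# Usage: check_disc [mount_point]   (defaults to /media/cdrom0)
check_disc() {
    local mount_point="${1:-/media/cdrom0}"
    # Run in a subshell so the calling terminal stays in its current folder.
    ( cd "$mount_point" && md5sum -c md5sum.txt )
}
```

For example, check_disc /media/cdrom1 would check a disc mounted at that path. If you’re not sure where the disc ended up, running mount with no arguments lists everything currently mounted.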

θθθθθθθθθθθθθθθθθθθθθ

When the check is over, if there are any errors, it will tell you how many failed the test out of how many listed. In the following example, you are actually presented with two errors at the end, the first complaining of a missing file, the other reporting one that seems to be corrupt:

md5sum: WARNING: 1 of 102 listed files could not be read
md5sum: WARNING: 1 of 101 computed checksums did NOT match

You can then scroll up the terminal if need be and find those that didn’t pass:

md5sum: ./Wallpaper01.jpg: No such file or directory
./Wallpaper01.jpg: FAILED open or read

./Wallpaper002.jpg: FAILED

In this example, Wallpaper01.jpg is seen as “missing” because it was renamed to Wallpaper001.jpg (to keep in line with the 3-digit numbering of the rest of the files) after the checksums were created. Wallpaper001.jpg is ignored entirely, since no hash was ever created for it, and Wallpaper01.jpg is reported as missing, since there is no longer a file of that name. That is also why the second warning counts 101 files rather than 102: no checksum could be computed for the missing one. Wallpaper002.jpg is probably corrupt. A file that fails the check may still open, but generally a failure means the file is corrupt, and the larger the file, the greater the chance of that.

Otherwise, if all you see is the command prompt with the last file above it with an OK next to it, then all is fine:

./Wallpaper100.jpg: OK
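
If scrolling back through hundreds of OK lines gets tedious, GNU md5sum has a --quiet option that prints only the files that did not pass. A minimal sketch follows; check_quiet is a made-up name, and the two messages it prints are my own wording, not md5sum’s:

```shell
# check_quiet is a hypothetical helper name.
# Usage: check_quiet [checksum_file]   (defaults to md5sum.txt)
check_quiet() {
    # --quiet suppresses the "OK" lines, so only problems are listed.
    if md5sum -c --quiet "${1:-md5sum.txt}"; then
        echo "All files OK"
    else
        echo "Some files did not pass; see the lines above"
        return 1
    fi
}
```

This relies on md5sum -c returning a non-zero exit status whenever a file is missing or its checksum does not match.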

θθθθθθθθθθθθθθθθθθθθθ

To make all this easier, make command aliases, like make5 (to generate an md5sum.txt file), 5 (to check a folder) and cd5 (to check a disc that can be verified). This will save you memorising and typing long commands, or even copying and pasting from a text file of commands you’ve probably got (if you’re clever).
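
The exact alias definitions aren’t shown here, so treat the following as one plausible way to set them up in bash, reusing the commands from this post unchanged. The ~/.bashrc location is an assumption about your setup:

```shell
# Add these lines to ~/.bashrc, then open a new terminal (or run: source ~/.bashrc).

# make5: generate md5sum.txt for the current folder, leaving md5sum.txt itself out
alias make5='find -type f -exec md5sum "{}" \; | sed "/md5sum.txt/d" > md5sum.txt'

# 5: check the current folder against its md5sum.txt
alias 5='md5sum -c md5sum.txt'

# cd5: change to the disc and check it (adjust the path if yours differs)
alias cd5='cd /media/cdrom0 && md5sum -c md5sum.txt'
```

Note that cd5 leaves the terminal sitting in the disc’s folder afterwards, just as the original command does.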

θθθθθθθθθθθθθθθθθθθθθ

To check a disc image or other file you’re downloading that has a checksum listed, you can generate a checksum, and simply compare the output with what is listed on the website:

md5sum name_of_the_image.iso

Obviously, you’ll need to replace the name in the example with the actual name of the file, but to save typing it if it is long, you can just enter md5sum (followed by a space), drag the downloaded file to the terminal and drop it there, then hit Enter (though you can, of course, just copy the file’s name as well). Then, as I said, simply compare the numbers in the terminal and website.
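
If you don’t trust your eyes with 32 characters, you can let md5sum do the comparing by feeding it the published hash in the same layout as a checksum file. This is only a sketch; verify_md5 is a made-up name:

```shell
# verify_md5 is a hypothetical helper name.
# Usage: verify_md5 <expected_hash> <file>
verify_md5() {
    # md5sum -c expects the hash, two spaces, then the file name;
    # the trailing "-" tells it to read that line from standard input.
    printf '%s  %s\n' "$1" "$2" | md5sum -c -
}
```

For example, verify_md5 8790491bfa9d00f283ed9dd2d77b3906 ubuntu-9.10-desktop-i386.iso reports OK next to the file name if the hash matches, and FAILED if it does not.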

Now, if you’re downloading a bunch of stuff, all with checksums supplied, you can create your own master checksum file, which will check them all in one go when you’re ready. Syntax is very important: each line is the hash, a space, an asterisk (which marks the file as binary) and then the file name, so the lines should look like this:

8790491bfa9d00f283ed9dd2d77b3906 *ubuntu-9.10-desktop-i386.iso
3faa345d298deec3854e0e02410973dc *ubuntu-9.10-alternate-i386.iso
dc51c1d7e3e173dcab4e0b9ad2be2bbf *ubuntu-9.10-desktop-amd64.iso

In this example, Ubuntu CDs are used, but the files can be anything, as long as you lay the lines out like that. You can name the file what you want, but if you want to stick with tradition, and to make it easier to check (via the command above, or its alias 5), name it md5sum.txt. And you can use this before you get all the files, as when you run the check, it will just tell you 2 out of 3 couldn’t be found (and you’ll see the one you did download listed, hopefully with an OK next to it).

If you name the checksum file something different, or in the case of the Ubuntu discs downloaded a master checksum file for all images, and it has a name like Ubuntu 9.10.MD5Sum (though that’s the name I actually gave it), it doesn’t matter. You can just enter md5sum -c (followed by a space), then either type the name of the file (in quotes if, like this one, it contains a space), or drag the file to the terminal. Note you can also do this with the alias 5: it will complain it didn’t find md5sum.txt, but then go on to verify the files recorded in Ubuntu 9.10.MD5Sum (or whatever your file is called). Of course, you could just rename the checksum file to md5sum.txt, but as you can see, you don’t really need to.

θθθθθθθθθθθθθθθθθθθθθ

When you’re going to back up a folder to DVD, always run a check on it first. That way, if you’ve done something like rename a bunch of files after the md5sum.txt file was created, you’ll know before burning a disc that would always come up with those “errors”. You can then either generate new checksums, or open md5sum.txt and replace the old names with the new ones (renaming files does not alter their checksum hashes, so you do not need to generate new ones for them).
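
If only a file or two was renamed, editing md5sum.txt by hand is quick enough, but the swap can also be scripted. The following is a rough sketch, assuming the usual layout of a 32-character hash, two separator characters, then the path; rename_entry is a made-up name:

```shell
# rename_entry is a hypothetical helper name.
# Usage: rename_entry <old_path> <new_path> [checksum_file]
rename_entry() {
    local old="$1" new="$2" list="${3:-md5sum.txt}"
    awk -v old="$old" -v new="$new" '
        # The path starts at column 35: 32 hash characters plus 2 separators.
        substr($0, 35) == old { $0 = substr($0, 1, 34) new }
        { print }
    ' "$list" > "$list.tmp" && mv "$list.tmp" "$list"
}
```

With the earlier example, rename_entry ./Wallpaper01.jpg ./Wallpaper001.jpg would point the existing hash at the new name. The path has to be written exactly as it appears in md5sum.txt, including the leading ./ if there is one.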

θθθθθθθθθθθθθθθθθθθθθ

So, hopefully that’s all you need to get you going in setting up some data verification, which comes in handy when wanting to make sure all the data on a DVD is valid before passing it on, or deleting the copies off your hard drive if archiving. And now that you know what those hashes or .md5 files are on websites, make sure you grab them, so you can verify the integrity of your downloads. And if you set up those aliases, all of this becomes even simpler, as those names are short and easy to remember.
