
How to rescue a partially damaged hard disk

The hard disk of my work computer was failing: it could no longer read some sectors, but otherwise worked fine. The following text shows how I recovered the filesystem. A couple of years later Florian Hackenberger pointed out that dd can cope with read errors if you supply the correct options, so you could also use something like: dd if=/dev/hdc1 of=disk.img bs=512 conv=noerror,sync. An even more comfortable rescue is possible with the specialized tools ddrescue or dd_rescue.
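
To see what those two dd options do, here is a small sketch that runs on an ordinary scratch file instead of a failing disk. conv=noerror tells dd to continue after a read error, and conv=sync pads every short or failed input block with zero bytes up to bs, so all later data keeps its original offset in the image. The file names sample.bin and sample.img are made up for this illustration.

```shell
#!/bin/sh
# Demonstration on a scratch file, not on a real disk.
workdir=$(mktemp -d)

# Create 1300 bytes of input: two full 512-byte blocks plus a short third one.
head -c 1300 /dev/zero | tr '\0' 'x' > "$workdir/sample.bin"

# noerror: continue after read errors; sync: pad each short block to bs with NULs.
dd if="$workdir/sample.bin" of="$workdir/sample.img" bs=512 conv=noerror,sync 2>/dev/null

# The copy is padded up to a whole number of blocks: 3 * 512 = 1536 bytes.
img_size=$(( $(wc -c < "$workdir/sample.img") ))
echo "image size: $img_size bytes"   # prints: image size: 1536 bytes

rm -rf "$workdir"
```

On a damaged disk the same padding happens for each block that cannot be read, which is why a small bs such as 512 loses the least data around a bad sector.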

My completely manual method

I got myself a second computer with a big empty hard disk and a running Linux. I installed the broken HD on the secondary IDE channel. My plan was to get a raw image of the main partition, mount it with Linux's loop option and copy the data onto a new HD.

The broken HD is /dev/hdc . Partition 1 is the / file system. Partition 2 was my swap device. I read in blocks of 1024 bytes.

[root@base /tmp]# dd if=/dev/hdc1 of=1 bs=1024
dd: /dev/hdc1: Input/output error
7332+0 records in
7332+0 records out

The first part of the disk was read. Evidently the 7333rd 1 KB block of the partition is damaged. I tried dd if=/dev/hdc1 of=tmp bs=1024 skip=7332 count=1 and got the I/O error again. So we just skip this block and proceed with the area after it.

[root@base /tmp]# dd if=/dev/hdc1 of=2 bs=1024 skip=7333
dd: /dev/hdc1: Input/output error
385907+0 records in
385907+0 records out

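We encountered another broken block and skip it as well. Be careful with the new skip value: skip always counts from the start of the partition, so you have to add the previous skip value, the number of records just read, and one for the bad block: 7333 + 385907 + 1 = 393241!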
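
The running total can be written as a tiny helper. This is only a sketch of the arithmetic; next_skip is a name invented here, not a standard tool.

```shell
#!/bin/sh
# next_skip PREVIOUS_SKIP RECORDS_READ
# Prints the skip= value for the next dd run: everything skipped so far,
# plus the blocks just read, plus the one unreadable block we step over.
next_skip() {
    echo $(( $1 + $2 + 1 ))
}

next_skip 0 7332        # after the first run: prints 7333
next_skip 7333 385907   # after the second run: prints 393241
```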

[root@base /tmp]# dd if=/dev/hdc1 of=3 bs=1024 skip=393241
1219527+0 records in
1219527+0 records out

Now we have finally read up to the end of the partition. We have three pieces, separated by two holes of 1 KB each. To fill the holes, we create 1 KB of zero bytes.

[root@base /tmp]# dd if=/dev/zero of=z bs=1024 count=1
1+0 records in
1+0 records out

Now we concatenate the files, and we have a nearly perfect image of our broken HD.

[root@base /tmp]# cat 1 z 2 z 3 > hd
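
A quick sanity check at this point is to compare the size of the assembled image with the number of blocks accounted for: 7332 + 1 + 385907 + 1 + 1219527 = 1612768 blocks of 1024 bytes. The helper below is a sketch; check_image_size is a hypothetical name, and it assumes the image is a regular file.

```shell
#!/bin/sh
# check_image_size IMAGE BLOCK_SIZE EXPECTED_BLOCKS
# Succeeds if IMAGE is exactly EXPECTED_BLOCKS * BLOCK_SIZE bytes long.
check_image_size() {
    actual=$(( $(wc -c < "$1") ))
    expected=$(( $2 * $3 ))
    if [ "$actual" -eq "$expected" ]; then
        echo "ok: $1 is $actual bytes"
    else
        echo "size mismatch: $1 is $actual bytes, expected $expected" >&2
        return 1
    fi
}

# Blocks read in the three dd runs plus the two zero-filled holes.
total_blocks=$(( 7332 + 1 + 385907 + 1 + 1219527 ))
echo "expected blocks: $total_blocks"   # prints: expected blocks: 1612768

# For the image built above one would run:
#   check_image_size hd 1024 "$total_blocks"
```

If the sizes differ, one of the skip values was probably off by a block, and every file behind that point would sit at the wrong offset.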

We should now run fsck on the image, because there will be at least 2 errors resulting from those two 1K holes, which we filled with 0 values.

[root@base /tmp]# fsck.ext2 -a hd
hd contains a file system with errors, check forced.
.... with some repair msgs from fsck

We can now mount this HD image on /mnt/old.

[root@base /tmp]# mkdir /mnt/old
[root@base /tmp]# mount hd /mnt/old -o loop

Our new HD is mounted on /mnt/new. We have already partitioned it with fdisk and made the filesystem with mke2fs. Now we have to copy the old to the new harddisk. We use tar so all special files and file permissions are preserved.

[root@base /tmp]# (cd /mnt/old && tar cf - .) | (cd /mnt/new && tar xpvf -)
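
The tar pipe can be tried out safely on scratch directories before it is used on the real mounts. The sketch below wraps it in a function called copy_tree, a name made up here. It archives . rather than *, because the shell's * does not match dotfiles in the top-level directory, and it names stdout and stdin explicitly with f - instead of relying on tar's compiled-in default archive.

```shell
#!/bin/sh
# copy_tree SRC DST: copy everything below SRC into the existing directory DST.
copy_tree() {
    # "." instead of "*" so that top-level dotfiles are included as well;
    # "f -" makes tar use stdout/stdin explicitly, "p" keeps permissions.
    (cd "$1" && tar cf - .) | (cd "$2" && tar xpf -)
}

# Trial run on scratch directories.
src=$(mktemp -d)
dst=$(mktemp -d)
echo "visible" > "$src/file.txt"
echo "hidden"  > "$src/.profile"
mkdir "$src/sub" && echo "nested" > "$src/sub/data"

copy_tree "$src" "$dst"
ls -A "$dst"
```

For the rescue itself the call would be copy_tree /mnt/old /mnt/new, run as root so that ownership and device files are restored as well.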

One small problem is left: the new HD is not bootable yet. We make a boot disk from the kernel image.

[root@base /tmp]# cat /mnt/new/vmlinuz > /dev/fd0

Now we have to set the root device the boot disk should boot from. In my case this is /dev/hda1.

[root@base /tmp]# rdev /dev/fd0 /dev/hda1

Congratulations, the hard part is done. We install the new HD back into our computer and boot once from the boot disk. Then I run the lilo command, so that a new boot sector is written.

Now we just have to clean up our files in the /tmp directory of our rescue computer and we are finished.

Automated Tool

A tool which automates the dump of the broken disk is ddrescue. It will intelligently skip over broken areas on the disk and can attempt to read a sector multiple times to increase the chance of getting results.
A similar program is dd_rescue.
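
As a rough sketch of how such a run could look with GNU ddrescue: the function below does a fast first pass and then a retry pass over the bad areas. The name rescue_image and the file names hd.img and hd.map are assumptions for this example; -n (skip the scraping phase) and -r (number of retry passes) are ddrescue options, and the third argument is the mapfile in which ddrescue records its progress.

```shell
#!/bin/sh
# rescue_image SOURCE IMAGE MAPFILE
# Two-pass rescue with GNU ddrescue; the mapfile records which areas were
# read, so an interrupted run can be resumed with the same arguments.
rescue_image() {
    # Pass 1: grab everything that reads easily, skip the slow scraping phase.
    ddrescue -n "$1" "$2" "$3" || return 1
    # Pass 2: go back to the bad areas and retry them up to three times.
    ddrescue -r3 "$1" "$2" "$3"
}

# Example call for the disk in this article (not executed here):
#   rescue_image /dev/hdc1 hd.img hd.map
```

Because of the mapfile there is no need to keep track of skip values by hand, and unreadable areas are left as gaps at the right offsets in the image.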

Restoring the partition table

If your partition table was destroyed, chances are good that you may be able to create a new table from information inside your partitions. I can recommend the program testdisk.

Restoring a filesystem

Collin Park describes his attempt to make a corrupted FAT filesystem readable again.

Monitor your hard drive

The smartmontools include a daemon (smartd) which monitors your hard drive via the S.M.A.R.T. interface, which should be present on all modern drives. In case of failure it can perform various actions, like sending email or shutting down the system, so that the drive is not damaged any further.


Please contact me for any reason: doj@cubic.org

http://llg.cubic.org © 2001-2017 by Dirk Jagdmann