Tips for fixing a broken RAID 1

As is often the case, I don’t really have the time to turn this into a proper guide, and others already exist, such as Replacing A Failed Hard Drive In A Software RAID1 Array. So I’ll just list some useful commands to diagnose a broken software RAID and fix it when there is no physical issue with the disk (i.e. I won’t cover fixing bad sectors, but I will cover checking for them).

1) Detecting that the RAID is out of sync:
cat /proc/mdstat
If the RAID is synchronized, you’ll see [UU]; if not, you’ll see [U_] or [_U], such as:

cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sda3[0]
      972141184 blocks super 1.2 [2/1] [U_]

md1 : active (auto-read-only) raid1 sda2[0]
      1999040 blocks super 1.2 [2/1] [U_]

md0 : active raid1 sda1[0]
      488192 blocks [2/1] [U_]

unused devices: <none>
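
If you want to check this from a script (a cron job, for instance), here is a minimal sketch that lists the arrays with a missing member. The function name degraded_arrays is hypothetical, and the sketch assumes the /proc/mdstat layout shown above, where the [UU]-style field sits on the line right after the array name.

```shell
#!/bin/sh
# List md arrays whose member-status field (e.g. [U_]) contains an underscore,
# meaning at least one member is missing.
# Usage: degraded_arrays [FILE]   (FILE defaults to /proc/mdstat)
# degraded_arrays is a hypothetical helper name, not part of mdadm.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }      # remember the current array name
        name != "" && /\[[U_]+\]/ {      # status line, e.g. "... [2/1] [U_]"
            match($0, /\[[U_]+\]/)
            if (index(substr($0, RSTART, RLENGTH), "_"))
                print name
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling degraded_arrays with no argument reads /proc/mdstat and prints one array name per line; no output means every array shows all members up.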

2) Checking for bad sectors, using smartctl. NB: if you don’t have it installed (possible if your host provided you with a “minimal” install), you should be able to install it with apt-get install smartmontools on Debian/Ubuntu.
Short tests (a few minutes each):
smartctl -d ata -t short /dev/sda
smartctl -d ata -t short /dev/sdb
Reading the results:
smartctl -a /dev/sda
smartctl -a /dev/sdb
If the short tests show no issues, run the long tests (a few hours each):
smartctl -d ata -t long /dev/sda
smartctl -d ata -t long /dev/sdb
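
The smartctl -a output is long, so here is a rough sketch that keeps only the sector-related attributes. It assumes the usual ATA attribute table layout (name in the second column, raw value in the tenth) and the common attribute names below; some drives report different names, so treat an empty result as "not reported", not as "healthy". The function name bad_sector_attrs is hypothetical.

```shell
#!/bin/sh
# Print "ATTRIBUTE=RAW_VALUE" for the sector-related SMART attributes found in
# `smartctl -A` (or `smartctl -a`) output read from stdin.
# Assumption: attribute name is column 2 and raw value is column 10.
bad_sector_attrs() {
    awk '$2 == "Reallocated_Sector_Ct" ||
         $2 == "Current_Pending_Sector" ||
         $2 == "Offline_Uncorrectable" { print $2 "=" $10 }'
}

# Example use (needs root and smartmontools):
#   for disk in /dev/sda /dev/sdb; do
#       echo "== $disk"
#       smartctl -A "$disk" | bad_sector_attrs
#   done
```

Non-zero raw values here are generally a reason to look at the full report before trusting the drive again.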

3) If there are errors on one of the drives, use hdparm to get more info to identify the drive (notably its model name and serial number), so as to be able to ask your host to change the right drive.
hdparm -I /dev/sdb
hdparm -I /dev/sda
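
Since only the model and serial number matter for the support ticket, a small filter can pull them out of the hdparm -I output. This is a sketch that assumes the "Model Number:" and "Serial Number:" labels hdparm prints for ATA drives; drive_identity is a hypothetical helper name.

```shell
#!/bin/sh
# Keep only the model and serial number lines from `hdparm -I` output (stdin),
# stripping the leading indentation and the padding after the colon.
drive_identity() {
    sed -n -e 's/^[[:space:]]*\(Model Number\):[[:space:]]*/\1: /p' \
           -e 's/^[[:space:]]*\(Serial Number\):[[:space:]]*/\1: /p'
}

# Example use (needs root):
#   hdparm -I /dev/sdb | drive_identity
```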

4) To list the partitions on all disks (it can be interesting to compare this with the output of cat /proc/mdstat):
fdisk -l

5) In my case, it turned out that the RAID had desynchronized but no disk was damaged: the second drive had simply stopped and fallen out of the RAID for some reason… So I was able to rebuild the RAID right away, by running the following command for each partition (do not forget to adapt the names of both the RAID device and the physical disk partition):
mdadm --manage /dev/md2 --add /dev/sdb3
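
To avoid typing the command once per partition, something like the following sketch could work. The readd_members name and the ARRAY:PARTITION argument format are made up for this example, and the pairs (md2:sdb3, md1:sdb2, md0:sdb1) are the ones from my layout, so adapt them. By default it only prints the commands, so they can be checked before anything touches the arrays.

```shell
#!/bin/sh
# Re-add partitions to their arrays. Each argument is ARRAY:PARTITION,
# e.g. md2:sdb3. With DRY_RUN=1 (the default here) the commands are only
# printed; set DRY_RUN=0 to actually run them (needs root).
# readd_members is a hypothetical helper, not an mdadm feature.
readd_members() {
    for pair in "$@"; do
        array=${pair%%:*}    # text before the colon, e.g. md2
        part=${pair#*:}      # text after the colon, e.g. sdb3
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "mdadm --manage /dev/$array --add /dev/$part"
        else
            mdadm --manage "/dev/$array" --add "/dev/$part" || return 1
        fi
    done
}

# Example use:
#   readd_members md2:sdb3 md1:sdb2 md0:sdb1             # print only
#   DRY_RUN=0 readd_members md2:sdb3 md1:sdb2 md0:sdb1   # really run
```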

And then a few snapshots of the rebuild progress (it takes quite a bit of time, of course, since the whole partition is copied, not just the used space):

cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sdb3[2] sda3[0]
      972141184 blocks super 1.2 [2/1] [U_]
      [=>...................]  recovery =  8.4% (82301952/972141184) finish=1017.9min speed=14568K/sec

md1 : active (auto-read-only) raid1 sda2[0]
      1999040 blocks super 1.2 [2/1] [U_]

md0 : active raid1 sda1[0]
      488192 blocks [2/1] [U_]
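
To follow the rebuild without reading the whole file each time, here is a small sketch that prints just the array name, the percentage and the estimated time left. It assumes the progress line format shown above ("recovery = 8.4% ... finish=1017.9min"); recovery_progress is a hypothetical helper name.

```shell
#!/bin/sh
# Print "ARRAY PERCENT FINISH" for every array that is currently rebuilding.
# Usage: recovery_progress [FILE]   (FILE defaults to /proc/mdstat)
recovery_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }              # remember the current array name
        /(recovery|resync) *=/ {                 # progress line of that array
            pct = ""; eta = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/)       pct = $i             # e.g. 8.4%
                if ($i ~ /^finish=/) eta = substr($i, 8)  # text after "finish="
            }
            print name, pct, eta
        }
    ' "${1:-/proc/mdstat}"
}
```

Running watch -n 60 cat /proc/mdstat is a simpler alternative if you just want to keep an eye on it. As a sanity check on the estimate above, assuming the block counts are in KiB: (972141184 - 82301952) / 14568 is about 61082 seconds, or roughly 1018 minutes, which matches the reported finish time.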

After the md2 partition is resynchronized:

cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sdb3[2] sda3[0]
      972141184 blocks super 1.2 [2/2] [UU]

md1 : active (auto-read-only) raid1 sda2[0]
      1999040 blocks super 1.2 [2/1] [U_]

md0 : active raid1 sda1[0]
      488192 blocks [2/1] [U_]

Adding back md0 (md1 was re-added the same way, with /dev/sdb2):
mdadm --manage /dev/md0 --add /dev/sdb1

And now after md0 was done too:

cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sdb3[2] sda3[0]
      972141184 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sdb2[2] sda2[0]
      1999040 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sdb1[1] sda1[0]
      488192 blocks [2/2] [UU]

Bonus, because I don’t want to lose the link and I’m not sure it deserves a whole new post: a few methods to wipe free disk space (I usually do that before giving back the servers I rent): http://superuser.com/questions/19326/how-to-wipe-free-disk-space-in-linux

Posted in hardware, Linux, servers.

By patheticcockroach – 2013-12-20