
mdadm issues - can anyone help?


Myatu
12-05-2013, 22:56
It *may* have been smart enough, provided you used 'missing'. If not, it would be a fairly accurate assumption that your data has been vaporised. (Re)create should only be used as the very last resort.
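
For reference, a re-create over an existing array is only borderline safe when the level, chunk size, metadata version and device order all match the original and the failed slot is literally given as 'missing'. Roughly this shape - the device order and missing slot below are placeholders, not the exact values for this box:

mdadm --create /dev/md3 --assume-clean --metadata=0.90 --level=5 --chunk=64 --raid-devices=4 /dev/sda3 missing /dev/sdc3 /dev/sdd3

Re-creating with --raid-devices=3 instead writes a completely different layout over the old data, which is why --create is a last resort.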

zimsters
12-05-2013, 21:18
Quote Originally Posted by Myatu
Looks healthy, but just how did you do the 'recovery' then (hopefully not with --create or --build)? What does 'fsck.ext3' give for the raid partition in question?
Tried assembling it a few times. Following that, I got advice to try using --create with 3 of the drives, on the basis that mdadm is smart enough to recognise that there was a pre-existing RAID and just mount/reassemble it. I tried that, and it did correctly show 'recovering...' before the whole thing was done.

Myatu
10-05-2013, 22:47
Looks healthy, but just how did you do the 'recovery' then (hopefully not with --create or --build)? What does 'fsck.ext3' give for the raid partition in question?
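
(A read-only pass is the safe way to check that, something like:

fsck.ext3 -n /dev/md3

The -n answers 'no' to every prompt, so nothing gets written to the array.)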

zimsters
09-05-2013, 11:34
Quote Originally Posted by Myatu
A minimum of 3 HDDs is required for RAID5, so if at one point you were missing more than 1 out of the 4, this could spell some trouble.

What's the output of 'cat /proc/mdstat' - Did it really complete?

Also, how does mdadm see the RAID now ('mdadm --detail /dev/md3')?

root@server:~# cat /proc/mdstat
Personalities : [raid1] [raid0] [raid6] [raid5] [raid4]
md3 : active raid5 sdb3[3](S) sdd3[2] sdc3[1] sda3[0]
1910460672 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md1 : active raid1 sda1[0] sdb1[1]
20480896 blocks [4/2] [UU__]

unused devices: <none>

root@server:~# mdadm --detail /dev/md3
/dev/md3:
Version : 00.90
Creation Time : Wed May 8 16:02:41 2013
Raid Level : raid5
Array Size : 1910460672 (1821.96 GiB 1956.31 GB)
Used Dev Size : 955230336 (910.98 GiB 978.16 GB)
Raid Devices : 3
Total Devices : 4
Preferred Minor : 3
Persistence : Superblock is persistent

Update Time : Thu May 9 11:32:48 2013
State : clean
Active Devices : 3
Working Devices : 4
Failed Devices : 0
Spare Devices : 1

Layout : left-symmetric
Chunk Size : 64K

UUID : c94974fa:532cd854:2a9f3f17:ad6d7156 (local to host londonpower.etbox.info)
Events : 0.4

Number Major Minor RaidDevice State
0 8 3 0 active sync /dev/sda3
1 8 35 1 active sync /dev/sdc3
2 8 51 2 active sync /dev/sdd3

3 8 19 - spare /dev/sdb3

sdb3 is the drive that crashed. It was replaced, the RAID was rebuilt, and it worked fine.

The reason it's showing as spare now is that I tried reassembly with only the 3 working 'old' drives, so it's only a 3-disk RAID at the moment.
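
For reference, if the data on the 3-disk array were actually intact, the usual way to fold the fourth disk back in would be a reshape along these lines (a sketch only - not something to run while the filesystem problem is unresolved, since a reshape rewrites the on-disk layout, and some mdadm versions also want a --backup-file for the critical section):

mdadm --grow /dev/md3 --raid-devices=4

With sdb3 already attached as a spare, the grow would pull it in and start the reshape.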

Myatu
08-05-2013, 23:26
A minimum of 3 HDDs is required for RAID5, so if at one point you were missing more than 1 out of the 4, this could spell some trouble.

What's the output of 'cat /proc/mdstat' - Did it really complete?

Also, how does mdadm see the RAID now ('mdadm --detail /dev/md3')?

zimsters
08-05-2013, 07:32
Quote Originally Posted by Thelen
That netboot thing says raid 1, only 2 of 4 mirrors loaded etc. Seems to me the mdadm config is wrong.

How do you know the raid partition was rebuilt?

And you might need to check the config manually to see that it is indeed using the 4 partitions; it doesn't seem to be, from that screenshot. I'd boot into kernel rescue and see from there what the config looks like.

The last thing: I know there are problems using RAID with /boot (and sometimes root). I can't remember what you have to do to make it work, but you might need to look into that - the rebuild might have broken something there.

OK - the box now boots, following some GRUB installs on each of the drives.
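
(Roughly, a GRUB install onto each member disk, i.e. something like:

grub-install /dev/sda
grub-install /dev/sdb
grub-install /dev/sdc
grub-install /dev/sdd

so the box can still find a bootloader whichever disk the BIOS picks.)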

I brought md3 up with 3 of the RAID 5 drives and then added the 4th back. It took about 5 hours to do a 'recovery', and now the issue when I try to mount it is:

mount: wrong fs type, bad option, bad superblock on /dev/md3,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so

dmesg:

disk 0, o:1, dev:sda3
disk 1, o:1, dev:sdc3
disk 2, o:1, dev:sdb3
disk 3, o:1, dev:sdd3
VFS: Can't find ext3 filesystem on dev md3.
Any thoughts?
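
For reference, a few non-destructive checks that might show whether any ext3 superblock is still visible on the assembled array (a sketch, using /dev/md3 as above):

dumpe2fs -h /dev/md3
mke2fs -n /dev/md3
e2fsck -n -b 32768 /dev/md3

dumpe2fs -h prints the primary superblock if one is readable, mke2fs -n is a dry run that only lists where the backup superblocks would sit, and e2fsck -n -b 32768 does a read-only check against a common backup-superblock location. If none of these find anything, the layout the array was (re)created with probably doesn't match the original.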

Thelen
08-05-2013, 05:56
That netboot thing says raid 1, only 2 of 4 mirrors loaded etc. Seems to me the mdadm config is wrong.

How do you know the raid partition was rebuilt?

And you might need to check the config manually to see that it is indeed using the 4 partitions; it doesn't seem to be, from that screenshot. I'd boot into kernel rescue and see from there what the config looks like.
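
From rescue mode, something along these lines would show what the config expects versus what the superblocks on the member partitions actually report (paths assumed for a Debian-style install):

cat /etc/mdadm/mdadm.conf
mdadm --examine --scan
mdadm --detail --scan

If the ARRAY lines in mdadm.conf and the output of --examine --scan disagree on UUIDs or device counts, that would explain the netboot screen only seeing part of the array.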

The last thing: I know there are problems using RAID with /boot (and sometimes root). I can't remember what you have to do to make it work, but you might need to look into that - the rebuild might have broken something there.

zimsters
07-05-2013, 22:08
Background:

Kimsufi box, 4 hard disks.

RAID 5 setup:

MD1 = boot = sd[abcd]1
MD3 = data = sd[abcd]3

I had sdb fail. It was replaced and the RAID partition was successfully rebuilt.

Server was rebooted. It doesn't boot now.

Kernel rescue mode shows the RAID arrays as active and healthy.

vKVM:
http://i.imgur.com/9WD3XwN.png when I boot from the hard disk
http://i.imgur.com/7clanEF.png when I boot with netboot.


Any help would be immensely appreciated!