OVH Community, your new community space.

Help needed to repair RAID


Bugsy
07-03-2011, 23:21
Thanks for your reply Myatu

Copied the RAID 1 setup from sdd.

That's managed to fix RAID 1 with all 4 drives...
Now I need to fix RAID 0.

Fingers crossed

Myatu
07-03-2011, 22:31
Quote Originally Posted by Bugsy
I think I sorted sda with mdadm --add /dev/md1 /dev/sda1, as that now shows as active and synced, but the problem is with /dev/sdc
How did Angie come up with it being RAID 5? It's RAID 1. Anyway, that's all you need to do for /dev/sdc (as you did with /dev/sda). The rebuild should start automatically (and, it being an exact copy of /dev/sdd, should not need a whole lot of time).
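
One catch: the fdisk output in the first post shows no partitions on /dev/sdc, so there is no /dev/sdc1 to add until the disk has a partition table again. A rough sketch of the order of operations, assuming the layout is cloned from the healthy /dev/sdd with sfdisk (an assumption about the method, and one that overwrites whatever table is on the target disk). The readd_plan helper is a made-up name and only prints the commands, so nothing is touched until you run them yourself:

```shell
#!/bin/sh
# Print (rather than run) the steps to re-add a blank disk to a RAID 1 array.
# readd_plan is a hypothetical helper; device names are the ones from this thread.
readd_plan() {
    good=$1   # healthy member disk, e.g. /dev/sdd
    blank=$2  # disk with no partition table, e.g. /dev/sdc
    array=$3  # md device, e.g. /dev/md1
    # Copy the MBR partition layout from the healthy disk (overwrites $blank's table).
    echo "sfdisk -d $good | sfdisk $blank"
    # Add the first partition of the re-partitioned disk to the array.
    echo "mdadm --add $array ${blank}1"
    # Watch the resync progress.
    echo "cat /proc/mdstat"
}

readd_plan /dev/sdd /dev/sdc /dev/md1
```

Double-check which disk is the source and which is the target before running the sfdisk line for real.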

If it fails to sync, do an initial check with smartctl -l error /dev/sdc and check its output. Alternatively, you can do a full disk test with smartctl --test=long /dev/sdc, but keep in mind that on a 1.5TB disk this will take a very, VERY long time (you can keep tabs on its progress with smartctl -a /dev/sdc). There's a --test=short test as well, but it is not as comprehensive and can miss some particulars (i.e., motor failure or overheating).
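
For reference, the checks above in the order you would likely run them. This is only a sketch: smart_plan is a made-up helper name, and it just prints the smartctl commands for a given disk instead of running them.

```shell
#!/bin/sh
# Print the SMART checks mentioned above for one disk, in a sensible order.
# smart_plan is a hypothetical helper name; it only echoes commands, it runs nothing.
smart_plan() {
    disk=$1
    echo "smartctl -l error $disk"      # quick look at the drive's error log
    echo "smartctl --test=short $disk"  # brief self-test
    echo "smartctl --test=long $disk"   # full self-test, slow on large disks
    echo "smartctl -a $disk"            # all SMART info, including self-test status
}

smart_plan /dev/sdc
```

The self-tests run inside the drive in the background, so the last command is the one to repeat while waiting.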

Quote Originally Posted by Bugsy
Why does it take over 24 hours to get a simple reply to a question on a ticket???
Forum response: 1 hr and 25 minutes

PS: Just noticed the "Disk /dev/md1: 2097 MB" - is the rest partitioned individually?
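
A quick way to sanity-check those sizes, as a sketch: the block counts in the fdisk and mdadm output are in 1 KiB units, so multiplying by 1024 gives bytes. The blocks_to_bytes name is made up for the example; the two numbers are taken from the output in the first post.

```shell
#!/bin/sh
# fdisk's "Blocks" column counts 1 KiB blocks; convert a block count to bytes.
# blocks_to_bytes is a hypothetical helper name for this illustration.
blocks_to_bytes() {
    echo $(( $1 * 1024 ))
}

blocks_to_bytes 2048192      # md1's array size in blocks, from mdadm -D
blocks_to_bytes 1461552032   # sda3, the large third partition
```

The first result matches the 2097348608 bytes fdisk reports for /dev/md1, so md1 only covers the small first partition on each disk. The second comes to roughly 1.5 TB, which is where the bulk of each disk sits.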

Bugsy
07-03-2011, 21:06
Why does it take over 24 hours to get a simple reply to a question on a ticket??? I replied to a ticket on 2011-03-05 03:26:06 and got the reply below on 2011-03-07 13:56:20... 2 days.
Hi,

I have checked and no one of the disk has samrt errors:
ns311690:~# smartctl -a -d ata /dev/sda
ns311690:~# smartctl -a -d ata /dev/sdb
ns311690:~# smartctl -a -d ata /dev/sdc
ns311690:~# smartctl -a -d ata /dev/sdd

sdc: you must reconfigure it and readd it in the RAID.
For do this you must first fix your raid. As is an RAID 5
you must check the problem with the other disks first.

ns311690:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6]
[raid5] [raid4] [multipath] [faulty]
md1 : active raid1 sdb1[1] sdd1[3]
2048192 blocks [4/2] [_U_U]


So add first sda: sync properly the raid and than configure
and add sdc.

Best regards,
Angie

I've added to the ticket again, knowing full well it will be another 24 hours before I get a reply, which is why I'm looking for help here, as the ticket system is a joke!!!
This is so frustrating!!!!!
Any advice would be a huge help.

Bugsy
07-03-2011, 19:16
One of my servers dropped its RAID array, I think. I opened a ticket, and after one of the OVH techs looked at it, they replied with this:

Hi,

I have checked and no one of the disk has samrt errors:
ns311690:~# smartctl -a -d ata /dev/sda
ns311690:~# smartctl -a -d ata /dev/sdb
ns311690:~# smartctl -a -d ata /dev/sdc
ns311690:~# smartctl -a -d ata /dev/sdd

sdc: you must reconfigure it and readd it in the RAID.
For do this you must first fix your raid. As is an RAID 5
you must check the problem with the other disks first.

ns311690:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6]
[raid5] [raid4] [multipath] [faulty]
md1 : active raid1 sdb1[1] sdd1[3]
2048192 blocks [4/2] [_U_U]


So add first sda: sync properly the raid and than configure
and add sdc.

Best regards,
Angie
This means nothing to me: "So add first sda: sync properly the raid and than configure". I have never rebuilt a RAID array before, so if anyone can help it would be appreciated.

root@rescue:~# fdisk -l

Disk /dev/sda: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000b4567

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1         255     2048256   fd  Linux raid autodetect
/dev/sda2             256         447     1535713   82  Linux swap / Solaris
Partition 2 does not end on cylinder boundary.
/dev/sda3             447      182401  1461552032   fd  Linux raid autodetect

Disk /dev/sdc: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x72968492

Device Boot Start End Blocks Id System

Disk /dev/sdd: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00009e79

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1         255     2048256   fd  Linux raid autodetect
/dev/sdd2             256         447     1535713   82  Linux swap / Solaris
Partition 2 does not end on cylinder boundary.
/dev/sdd3             447      182401  1461552032   fd  Linux raid autodetect

Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000c8f61

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1         255     2048256   fd  Linux raid autodetect
/dev/sdb2             256         447     1535713   82  Linux swap / Solaris
Partition 2 does not end on cylinder boundary.
/dev/sdb3             447      182401  1461552032   fd  Linux raid autodetect

Disk /dev/md1: 2097 MB, 2097348608 bytes
2 heads, 4 sectors/track, 512048 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Disk identifier: 0x00000000

Disk /dev/md1 doesn't contain a valid partition table

root@rescue:~# mdadm -D /dev/md1
/dev/md1:
Version : 00.90
Creation Time : Thu Apr 1 19:47:52 2010
Raid Level : raid1
Array Size : 2048192 (2000.52 MiB 2097.35 MB)
Used Dev Size : 2048192 (2000.52 MiB 2097.35 MB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 1
Persistence : Superblock is persistent

Update Time : Mon Mar 7 19:59:17 2011
State : clean, degraded
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0

UUID : 8cf12db1:a9bfc6f0:a4d2adc2:26fd5302 (local to host rescue.ovh.net)
Events : 0.3587068

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       0        0        2      removed
       3       8       49        3      active sync   /dev/sdd1

I think I sorted sda with mdadm --add /dev/md1 /dev/sda1, as that now shows as active and synced, but the problem is with /dev/sdc

root@rescue:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
md1 : active raid1 sdb1[1] sdd1[3] sda1[0]
2048192 blocks [4/3] [UU_U]

unused devices: <none>
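
For anyone else reading this kind of output: in [4/3] [UU_U], the first pair is total slots versus active members, and each U or _ is one slot, with _ marking a missing member. A small sketch that pulls the missing slot numbers out of mdstat-style text; missing_slots is a made-up helper name, and it relies on grep -o, which is common but not strictly POSIX.

```shell
#!/bin/sh
# Report which member slots of an md array are missing, given mdstat-style text on stdin.
# missing_slots is a hypothetical helper name, not part of mdadm.
missing_slots() {
    # Pull the first status map such as [UU_U] from the input and strip the brackets.
    map=$(grep -o '\[[U_][U_]*\]' | head -n 1 | tr -d '[]')
    i=0
    out=""
    # Walk the map one character at a time; "_" marks a missing slot.
    while [ -n "$map" ]; do
        rest=${map#?}
        first=${map%"$rest"}
        if [ "$first" = "_" ]; then
            out="$out $i"
        fi
        map=$rest
        i=$((i + 1))
    done
    echo "${out# }"
}

printf '%s\n' 'md1 : active raid1 sdb1[1] sdd1[3] sda1[0]' \
              '2048192 blocks [4/3] [UU_U]' | missing_slots
```

For the output above this prints 2, which lines up with the "removed" entry for RaidDevice 2 in the mdadm -D listing.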

Any advice on what I need to do now would be a great help, as I'm stuck.
Thanks