OVH Community, your new community space.

How do you (clients of OVH) manage disk failures?

06-01-2014, 23:24
I also keep backups of my backups in two locations in case the original backup fails (after it happened once...)

04-01-2014, 17:50
I run RAID 1, but only because I can't run RAID 5 on two drives.
I only back up things I cannot recreate, so rsync is fine for those cases where my Kimi isn't already using a copy.

In the face of total hdd death, it would take me a day or so to get back on the air, as I would need to rebuild my gentoo-hardened install and I don't back up the binaries. That suits my use case.

I've lost two HDDs in the last few years, both within a few minutes of each other and both in the same RAID 5 set.

04-01-2014, 16:20
With OVH, I've had two HDD issues in the past, but they were resolved in a fairly reasonable fashion (certainly not more than 24 hours).

I also use at the very least RAID 1. As that's not a guarantee that no loss of data will occur, I used to use Duplicity for backups, but have switched to BackupPC for its ease of management and de-duplication across all the backups, rather than a single backup as with Duplicity.

The BackupPC server does daily incremental backups of each server and retains 14 of them, and does a full backup every week, 2 of which are retained. For database servers it does a full dump prior to backup, and removes them once backup has been completed.
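
For anyone who wants to see that retention policy spelled out, here is a minimal Python sketch of "keep the 14 newest incrementals and the 2 newest fulls". It is not BackupPC's own code (BackupPC handles retention internally through its configuration); the function name `backups_to_keep` and the `(date, kind)` tuples are hypothetical, chosen only for illustration.

```python
from datetime import date, timedelta

def backups_to_keep(backups, keep_incr=14, keep_full=2):
    """Return the backups a 14-incremental / 2-full policy would retain.

    backups: iterable of (date, kind) tuples, kind being "full" or "incr".
    """
    newest_first = sorted(backups, key=lambda b: b[0], reverse=True)
    fulls = [b for b in newest_first if b[1] == "full"][:keep_full]
    incrs = [b for b in newest_first if b[1] == "incr"][:keep_incr]
    return sorted(fulls + incrs)

# Example: 30 days of backups, a full every Sunday, incrementals otherwise.
start = date(2014, 1, 1)
history = []
for offset in range(30):
    day = start + timedelta(days=offset)
    kind = "full" if day.weekday() == 6 else "incr"  # weekday() 6 is Sunday
    history.append((day, kind))

kept = backups_to_keep(history)
```

With that history, `kept` holds 16 entries: the two most recent Sunday fulls plus the 14 most recent incrementals. Everything older would be eligible for deletion.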

04-01-2014, 15:21
I run daily backups to home for website data, and backups every 3 hours to home for databases. All data is archived here for 7 days. I've never lost any data to date by doing this.

OVH's response to disk failures is appalling; you can see my experience here: After 6 years with OVH I decided to move to a company that cares about its customers. It was a hard decision, but it's clear that OVH doesn't value its loyal customers.

My advice is to do one of the following:
1. Back up to a server at home regularly (if you have the upload bandwidth to restore from it if required).
2. Purchase another cheap server from OVH or another provider and back up to that.

In my opinion, monthly or even weekly backups aren't often enough. A lot can change in 24 hours, so a lot more is going to change over a week. At the very least you should back up your databases daily, as they are likely to change more often.

I'm a bit of a backup buff because I hate losing things. I even made my own backup scripts to manage the archiving. I'll explain my process in case it's of interest to you...

1. The server initiates a backup to the backup server via FTP, using a piece of software called SyncBackSE. It's capable of remembering the directory structure, so it doesn't have to scan the FTP server each time, and it allows partial backups.

2. The main data copies at 11pm daily; the databases copy every 3 hours (12am, 3am, etc.).

3. On the backup server, disk 1 is the data drive where the backups copy to. Disk 2 is the archive disk. On a set schedule (about 1.5hrs after the main server to backup server copy), it copies the relevant directories from disk 1 to disk 2 into an archive folder and deletes the oldest backup. Disk 1's data always remains as the "latest" backup just in case the archive disk fails (just another layer of backups really).

4. I get emailed to tell me the status of that backup run.
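
Step 3 above (copy the latest backup into a dated archive folder, then drop the oldest) could be sketched in Python roughly like this. It is only an illustration of the idea, not the actual scripts described in this post: the function name `archive_latest`, the timestamped folder naming, and the default of keeping 7 snapshots are all assumptions.

```python
import shutil
from datetime import datetime
from pathlib import Path

def archive_latest(latest_dir, archive_root, keep=7, now=None):
    """Copy latest_dir into a timestamped folder under archive_root,
    then delete the oldest snapshots so that at most `keep` remain."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    now = now or datetime.now()
    archive_root = Path(archive_root)
    archive_root.mkdir(parents=True, exist_ok=True)

    # Timestamped names sort chronologically when sorted as plain strings.
    target = archive_root / now.strftime("%Y-%m-%d_%H%M%S")
    shutil.copytree(latest_dir, target)

    snapshots = sorted(p for p in archive_root.iterdir() if p.is_dir())
    for old in snapshots[:-keep]:
        shutil.rmtree(old)  # remove everything except the newest `keep`
    return target
```

The source directory (disk 1 in the description) is never touched, so it stays available as the "latest" copy even if the archive disk dies.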

It's all automated once set up and has only failed to work 2 or 3 times in 4 years, and those failures were due to the cron not firing or the disk being full.
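
Both of those failure modes (a full disk, a cron job that silently never ran) can be caught with a small pre-flight check. The following is a hedged Python sketch, not part of the setup described here; the function names, the 10% safety margin and the 26-hour staleness threshold are assumptions picked for a daily backup.

```python
import shutil
import time
from pathlib import Path

def enough_free_space(path, needed_bytes, margin=0.10):
    """True if the filesystem holding `path` has room for needed_bytes
    plus a safety margin (10% by default)."""
    free = shutil.disk_usage(path).free
    return free >= needed_bytes * (1 + margin)

def backup_is_stale(path, max_age_hours=26):
    """True if nothing directly inside `path` was modified within
    max_age_hours, which suggests the scheduled job did not run."""
    newest = max((p.stat().st_mtime for p in Path(path).iterdir()), default=None)
    if newest is None:
        return True  # an empty backup directory counts as stale
    return (time.time() - newest) > max_age_hours * 3600
```

Running the space check before each copy and the staleness check from a separate schedule (so one missed cron does not hide the other) would flag the problem in the status email instead of leaving it unnoticed.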

In total, it handles about 50GB of backups, but I don't doubt it could cope with a lot more.

Since I've been with OVH I've suffered 4 disk failures, and not a single one was handled well, so I would make sure you have a backup plan in place should it happen, in the form of another server you can fail over to or something else.

03-01-2014, 19:43
Touch wood, I've only had one HD go since 2008, just wanted to get some ideas to minimise the downtime if and when it happens again.

03-01-2014, 19:37
I am lucky enough to have not had one fail yet.

All my projects are backed up every day, so if I know a failure is imminent I can transfer them to the backup server and point everything to that server while the ticket is sent for a replacement HDD.

All drives fail, I guess; it's just a question of when.

03-01-2014, 18:59
I had one fail in a Kimi last July, it totally died - took roughly 24hrs to be replaced.

Despite all the talk that hard drive replacement should have a decent SLA window (on all servers), I'm just wondering how you all cope when a hard drive isn't totally gone. If you open (or request) a ticket, you do not actually know when the drive will be replaced, even when you give the tech the go-ahead. I know most of you have a backup plan in place, but if you run software that really can't do redundancy 100%, you're playing a waiting game while new data is being written to disk, etc.