
iscsi32.rps.ovh.net still unstable


BELLonline
25-06-2009, 14:28
Update: Support staff useless. Cancelling server for replacement (which I should have done weeks ago).

OVH management: Your support staff really need to up their game.

BELLonline
25-06-2009, 12:33
It is CentOS 5.3, but it looks like someone is at least answering my tickets now, which is all I wanted.

Myatu
25-06-2009, 04:04
Hmm, definitely not a good thing. Now, which OS are you using on your RPS? CentOS 5.2 has a kernel bug that affects iSCSI (fixed in 5.3): it thinks a ping has timed out and reports a connection error, even though everything is up and running just fine.

BELLonline
24-06-2009, 21:52
I've managed to install a new operating system on it, but it's still having problems. The hard drive has gone into read-only mode, and I know that if I reboot the server it will die completely.

This is the last message in the server logs:

Jun 24 16:47:20 stock kernel: connection1:0: ping timeout of 5 secs expired, last rx 62006310, last ping 62007560, now 62008810

There are also previous errors like this:

Jun 21 19:52:01 stock kernel: connection1:0: detected conn error (1011)
Jun 21 19:52:02 stock iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Jun 21 19:52:06 stock iscsid: connection1:0 is operational after recovery (1 attempts)


It might be something as simple as a loose connection, but as I'm not prepared to fly to France and knock on the door to get support, there's not much I can do about this.

I'll probably let the server expire and someone else can have the ball ache.

Myatu
24-06-2009, 20:05
Originally posted by BELLonline:
Disk sda (10 GB)
SMART state: ERROR [Logs]

Click [Logs]:

smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

This shouldn't stop your RPS from booting, as it simply doesn't have any HDs at all (only the remote iSCSI disk). It probably only has to do with being in Rescue Mode, so you can ignore it. But if this error also shows up when booting your RPS normally, smartmon needs to be disabled.

Since you're able to get into Rescue Mode, you can browse the HD's contents. Go into your /var/log folder and pay particular attention to the syslog file (/var/log/syslog). It includes the kernel boot messages, so going through these (and looking at the timestamps), you can see at what point your server fails.
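
If the log is long, a small script can pull out the iSCSI-related lines for you. This is only a rough sketch: the patterns are based on the kernel and iscsid messages quoted elsewhere in this thread, the function names are made up for illustration, and the log path is an assumption (on some distributions the file is /var/log/messages, and in Rescue Mode it sits under wherever you mounted the disk).

```python
import re

# Patterns modelled on the log lines quoted in this thread; other kernels
# or open-iscsi versions may word these messages differently.
PATTERNS = {
    "ping_timeout": re.compile(r"connection\d+:\d+: ping timeout of (\d+) secs expired"),
    "conn_error": re.compile(r"connection\d+:\d+: detected conn error \((\d+)\)"),
    "recovered": re.compile(r"connection\d+:\d+ is operational after recovery"),
}


def find_iscsi_events(lines):
    """Return (timestamp, kind) pairs for lines that match a known pattern."""
    events = []
    for line in lines:
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                # A classic syslog line starts with a 15-character timestamp,
                # e.g. "Jun 24 16:47:20".
                events.append((line[:15], kind))
                break
    return events


def scan_file(path="/var/log/syslog"):
    """Print the iSCSI events found in a log file (the path is an assumption)."""
    with open(path, errors="replace") as log:
        for stamp, kind in find_iscsi_events(log):
            print(stamp, kind)
```

Calling scan_file() with the path to your mounted log should list the timeouts, errors and recoveries in order, which makes it easier to see whether the connection drops before or during boot.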

DedicatedPros
24-06-2009, 19:52
Call them. I would've called the day after filing a ticket.

BELLonline
24-06-2009, 19:51
I've had a support ticket open for almost 3 weeks now, I'm pretty sure there have been enough working hours since then for them to reply.

Assuming 8 working hours per week day:

19 days since ticket opened. Minus 6 weekend days = 13

13 days x 8 hrs = 104 working hours.

I think that's plenty of working hours

still no reply.
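
For anyone who wants to check the sums, here is a quick sketch of the same calculation. It assumes the ticket's opening date (5 June 2009) as the start and today's date as the end, counts Monday to Friday only, and ignores public holidays; the function name is just for illustration.

```python
from datetime import date, timedelta


def working_hours(opened, today, hours_per_day=8):
    """Return (elapsed days, weekdays, working hours) between two dates.

    Counts each day from `opened` up to, but not including, `today`.
    Public holidays are not taken into account.
    """
    elapsed = (today - opened).days
    weekdays = sum(
        1
        for i in range(elapsed)
        if (opened + timedelta(days=i)).weekday() < 5  # 0-4 is Monday-Friday
    )
    return elapsed, weekdays, weekdays * hours_per_day


print(working_hours(date(2009, 6, 5), date(2009, 6, 24)))  # (19, 13, 104)
```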

Ashley
21-06-2009, 19:15
You really need to contact them within working hours.

BELLonline
21-06-2009, 18:28
Come on, I spend nearly 600 a month with your company and no one is bothering to answer my ticket. I didn't want to, but I'm going to try a reinstall. I can't see it working, though, if the server can't communicate with the hard disk.

BELLonline
18-06-2009, 15:12
Hi,

I've been waiting for 2 weeks for a response from support on this issue.

Please respond:

Ticket number 146732
Date opened: 2009-06-05 14:25:03
Last Reply (by me): 2009-06-13 18:03:14
Status: Awaiting
Domain: r23636.ovh.net

Luckily, this isn't a hugely important server so I've only lost about 50-60 as a result, but it's still irritating that no one is prepared to answer my ticket when it is clearly a hardware failure. There's no point me reinstalling the server because it can't access the hard disk to do anything.

fozl
12-06-2009, 16:29
I've added what you've said above to the ticket that's already open about this.

BELLonline
12-06-2009, 14:14
My server r23636.ovh.net still doesn't boot up into anything other than rescue mode, although I've set the Netboot to HD. Something isn't fixed. Is the hard disk corrupt or something? I'd rather not have to reinstall it, but if I do then I'd rather know.

When I go to reboot the server, it fails and I get a "defect on ... server" email. About an hour later I get another email saying that an intervention has been completed (with no details in the ticket of what they have done), and all they seem to have done is boot it up into rescue-pro mode.

The following error comes up:

Disk sda (10 GB)
SMART state: ERROR [Logs]

Click [Logs]:

smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

marks
08-06-2009, 10:36
It was fixed on Saturday 06/2009, 08:17. Sorry for the inconvenience.

BELLonline
07-06-2009, 15:34
Is there an ETA on when iscsi32.rps.ovh.net will be fixed? It's making some of my RPS servers unusable, because the drive is showing as Read Only.

According to this, it's fixed but that doesn't seem to be the case:

http://travaux.ovh.net/?do=details&id=3135