OVH Community, your new community space.

RBX2 S06 issues?


marks
24-11-2010, 11:10
Hey olliegooch,

There are already several categories: Domain names, Hosting, Dedicated servers, Datacentre, Network and racks, ...

Also, there are 3 types of tasks:

a) Maintenance
b) Incident
c) Modernization

Unfortunately, there is only a single RSS feed covering all the tasks posted on status.ovh.net. But with a good email client, it should not be difficult to filter the posts down to the ones relevant to you.

e.g.: "Maintenance" + "Modernization" in "Dedicated servers", "Datacentre" and "network and racks"
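As a rough illustration of that kind of filter, here is a minimal Python sketch. It assumes each feed item carries its task type and its category as RSS `<category>` elements; that layout, the sample feed and the `filter_items` helper are all hypothetical, and the real status.ovh.net feed may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; the real feed may use a different layout.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>status (sample)</title>
    <item>
      <title>UPS test in RBX2</title>
      <category>Maintenance</category>
      <category>Datacentre</category>
    </item>
    <item>
      <title>Mail cluster slow</title>
      <category>Incident</category>
      <category>Hosting</category>
    </item>
    <item>
      <title>New routers</title>
      <category>Modernization</category>
      <category>Network and racks</category>
    </item>
    <item>
      <title>Power loss</title>
      <category>Incident</category>
      <category>Dedicated servers</category>
    </item>
  </channel>
</rss>"""


def filter_items(feed_xml, task_types, categories):
    """Return titles of items tagged with a wanted task type AND a wanted category."""
    wanted_types = {t.lower() for t in task_types}
    wanted_cats = {c.lower() for c in categories}
    root = ET.fromstring(feed_xml)
    matches = []
    for item in root.iter("item"):
        # Collect every <category> tag on the item, compared case-insensitively.
        tags = {(c.text or "").strip().lower() for c in item.findall("category")}
        if tags & wanted_types and tags & wanted_cats:
            matches.append(item.findtext("title", default=""))
    return matches


relevant = filter_items(
    SAMPLE_FEED,
    task_types=["Maintenance", "Modernization"],
    categories=["Dedicated servers", "Datacentre", "Network and racks"],
)
print(relevant)
```

The same two-set test (one task type plus one category) is what a mail client rule would express; the script form is only useful if you would rather poll the feed yourself.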

We've noted down your suggestions. We're working on getting status updated more quickly so that it's more useful for you.

Please do let us know if you have any further suggestions.

olliegooch
23-11-2010, 23:17
Quote Originally Posted by fozl
Sure you can. Coppers on you though.
Hi Fozl,

Would it be possible for the OVH status site (either the French or the English one) to have certain "categories" attached to each status update? (These would need to be static and user-selectable, as opposed to a free-text box, since you're likely to get a lot of deviation if it's free text.)

So for instance, this page (for this outage):

http://status.ovh.net/?do=details&id=836

Suppose that page had a section for "Affected Services" and a section for "Severity".

I understand that it would require work for OVH to set up a notification system for maintenance / incidents, but from my point of view, if there are consistent categories and severities, I can write something to scrape them and notify me if anything is going on that I should be aware of.
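To make the idea concrete, here is a minimal Python sketch of such a scraper. Everything about the page layout is an assumption: the status page does not currently have these fields, so the sample HTML, the "Affected Services" / "Severity" table rows, the low/medium/high scale and the helper names are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical markup for the two proposed sections; not the real page.
SAMPLE_PAGE = """
<table>
  <tr><th>Affected Services</th><td>Dedicated servers, Datacentre</td></tr>
  <tr><th>Severity</th><td>High</td></tr>
</table>
"""

# Assumed severity scale, lowest first.
LEVELS = ["low", "medium", "high"]


class FieldParser(HTMLParser):
    """Collect <th>label</th><td>value</td> pairs into a dict."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._tag = None    # "th" or "td" while inside such a cell
        self._label = None  # most recent <th> text awaiting its <td>

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._tag = tag

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._tag == "th":
            self._label = text
        elif self._tag == "td" and self._label:
            self.fields[self._label] = text
            self._label = None


def parse_status(html):
    """Extract the service list and severity from a status page."""
    parser = FieldParser()
    parser.feed(html)
    parser.close()
    raw = parser.fields.get("Affected Services", "")
    services = [s.strip() for s in raw.split(",") if s.strip()]
    return {"services": services, "severity": parser.fields.get("Severity", "").lower()}


def should_notify(status, my_services, min_severity="medium"):
    """True if the update touches one of my services at or above min_severity."""
    sev = status["severity"]
    # An unknown severity is treated as the highest level, to err on the side of alerting.
    rank = LEVELS.index(sev) if sev in LEVELS else len(LEVELS)
    return rank >= LEVELS.index(min_severity) and any(
        s in my_services for s in status["services"]
    )


status = parse_status(SAMPLE_PAGE)
print(status, should_notify(status, {"Dedicated servers"}))
```

With fixed, selectable values the matching stays an exact comparison; with free text it would need fuzzy matching, which is the deviation problem mentioned above.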

Ollie

Andy
23-11-2010, 19:21
My point was, even fail over systems fail on occasion. It's impossible to guarantee 100% uptime for any service. OVH sorted the problem in less than a few minutes, and while it was an inconvenience they did their best to get everything back up. You have to understand that these things happen sometimes.

Even Google have had their fair share of down time and they have what is probably the BEST infrastructure in the world designed for fail overs and redundancy.

I for one am happy they sorted it so quickly and kept us informed as to the cause on the status site.

HandsomeChap
23-11-2010, 19:07
Hi Andy, my servers have been hit by a power outage before because the UPS did not kick in in time. Maybe it's the first time for this exact problem, but it's not the first time overall, and explaining yet again to customers that there was a power outage in a data centre sounds like we're making it up, given all the backups and protection that are in place. I'd accept it if I was hosting my server in some cheap-ass converted room with second-rate systems, etc., but really, a structure such as RBX2, which is built to such a high standard, and still they can't keep the juice flowing? Not acceptable.

fozl
23-11-2010, 17:15
Point is noted. The plan (and I don't have an ETA for this) is that in future you will be informed when a specific incident or intervention will or may affect your service. So you won't be spammed with service status information that does not affect you.

hokapoka
23-11-2010, 17:04
One assumes that status update was a translation into English from French as it's ever so hard to comprehend.

Reading the status update : "Today, we have scheduled the maintenance tasks "

It appears that some physical maintenance was scheduled ahead of time. Is there some form of email notification of when these sorts of tests & modifications might be taking place?

Where I have Co-Lo or Dedicated servers with other ISPs they offer such a notification.

That is, whenever tests, upgrades or modifications that pertain to any of the UPS, generators or network pipes are scheduled, at least 24 hours' notice is given, just to forewarn the sysadmins that there might be some form of outage.

I wholly understand the possibilities of unforeseen technical issues and power outages. Also it's very easy to understand how trying to fix an unknown problem takes time and effort, for which I am grateful that such efforts are being made to resolve these issues.

However, if they are the result of a test, upgrade or scheduled maintenance/modification then, by definition, they are the result of a physical action that is known ahead of time.

Thus a warning could be given that *something* is going to be undertaken which would allow us sysadmins to be prepared for a possible outage.

Andy
23-11-2010, 15:15
Info now available on http://status.ovh.net/?do=details&id=836

HandsomeChap, this is an unusual error in the system. It has not happened before, even for the UPS manufacturer, so they are investigating why. They cannot fix something if they don't know the origin of the error.

HandsomeChap
23-11-2010, 14:01
Great I've got a few reels of mains cable, I'll solder them together and I'm sure they will stretch.

Seriously though, why is room 26 so broken? Can the next step please be to replace all the UPS batteries and make sure they actually charge this time?

fozl
23-11-2010, 13:45
Sure you can. Coppers on you though.

HandsomeChap
23-11-2010, 13:21
The curse of room 26!

Yet again we are victims of a power cut in a data centre, pretty ironic really.

My home PC's been online for 192 days: no APC, no diesel generator in the shed, no dual power lines, in fact no redundancy at all, just the plain old British National Grid.

Question for OVH: Can I run a power cable from here over to RBX2 to keep my servers online please?

Andy
23-11-2010, 12:35
About 15 mins later and servers are all coming back up...


hokapoka
23-11-2010, 12:24
OMG DOWN DOWN and DOWN!

Andy
23-11-2010, 12:17
What's happened in S06 in RBX2? Hundreds of servers are down including mine (scrub that, it's just started pinging again).

http://status.ovh.net/vms/index_rbx2.html



Even so, what happened? My server actually rebooted. Power outage?