Random Downtime


N-joi
02-08-2008, 13:55
I've rebooted the system twice now: first when I discovered the problem, and again this morning.

I can connect through remote desktop, but I'm just unable to use my connection.

BELLonline
02-08-2008, 13:01
No problems here. Did you try rebooting it? That will be much quicker than contacting support. Chances are it just crashed.

N-joi
02-08-2008, 10:04
I've emailed support, but I'm still waiting for them to solve it.

kayomani
02-08-2008, 09:12
I am from the UK and haven't been having any connection problems to my server.

N-joi
02-08-2008, 04:14
I have totally lost connection to my server. Is anyone else having the same problem?

According to my stats, it went down at 1 AM.

Marco
29-07-2008, 09:56
I apologize, I was wrong: it's definitely 1 Gbps and not 1 GBps.

I confirm that we have had a 1 Gbps link working in LINX up to now. Not all UK traffic was passing through LINX, only the traffic directed to some internet service providers such as BT.

Currently all the traffic that was passing through LINX is passing through Global Crossing in Paris.

We are going to upgrade the network in London, and in 2-3 days we will have 10 Gbps in LINX and 10 Gbps in Global Crossing (in London).

The LINX link was working fine up to now, as up to Friday of last week it was not fully utilised. It has now been replaced temporarily and will be re-established soon.

In order to check the network usage, you can go to the following link:
weathermap.ovh.net/

IainK
28-07-2008, 21:59
1 gigabyte/second? So at 8 bits per byte that works out to 8 gigabits per second?
If you intended 10 gigabits/second then you should either do the correct maths or stick to Gbps.

GigaBytes/second has never been a standard measurement of line speed!
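
For anyone following the units argument, here is a minimal sketch of the arithmetic. It assumes the usual 8 bits per byte; the function names are made up for illustration and do not come from any OVH tool.

```python
BITS_PER_BYTE = 8


def gigabytes_to_gigabits(gigabytes_per_second: float) -> float:
    """Convert a rate in gigabytes/second (GB/s) to gigabits/second (Gbps)."""
    return gigabytes_per_second * BITS_PER_BYTE


def gigabits_to_gigabytes(gigabits_per_second: float) -> float:
    """Convert a rate in gigabits/second (Gbps) to gigabytes/second (GB/s)."""
    return gigabits_per_second / BITS_PER_BYTE


# A link quoted as "1 GB" per second would really be an 8 Gbps link,
# while a 10 Gbps link carries 1.25 gigabytes per second.
print(gigabytes_to_gigabits(1))
print(gigabits_to_gigabytes(10))
```

So "1GB" and "1Gb" differ by a factor of eight, which is why the two cannot be used interchangeably when describing a link.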

Andy
28-07-2008, 18:53
Quote Originally Posted by Marco
The UK peering link is 1GB and not 1Gb. Moreover we have all our transit links which work everywhere in the world. The issue has now been solved, as we have switched the traffic to a transit link and switched momentarily off the UK peering link. This will cause a bit of latency, but at least there will not be any more packet loss.
Anyway very soon the network will be upgraded and the UK peering link re-established.
Marco.
It would be helpful to everyone if you use a standard measurement of speed rather than switching between two and confusing everyone.

Can you please confirm:-

Currently, LINX = 10Gbps?
and will become 2x10Gbps?

Marco
28-07-2008, 17:48
The UK peering link is 1GB and not 1Gb. Moreover we have all our transit links which work everywhere in the world. The issue has now been solved, as we have switched the traffic to a transit link and switched momentarily off the UK peering link. This will cause a bit of latency, but at least there will not be any more packet loss.
Anyway very soon the network will be upgraded and the UK peering link re-established.
Marco.

IainK
28-07-2008, 16:38
The UK peering is only 1gbps and you have sold how many UK servers?! Really, this should be absolute top priority!!

Marco
28-07-2008, 16:20
Hello,
We are experiencing some problems with our network peering connection to the UK.

You can check the status of the issue on the following link:
http://travaux.ovh.net/?do=details&id=2343

We are going to upgrade our network connection in the UK soon (this or the next), so that instead of having 1GB of peering we will have 2*10GB. After that, this kind of problem will no longer exist.
Marco.

Andy
28-07-2008, 11:47
Quote Originally Posted by BELLonline
Lol that's clever - they have people telling them that there is a problem with a backbone router and they ignore it
Perhaps it's not apparent to them. If it were, the ticket would still be open.

BELLonline
28-07-2008, 11:21
Lol that's clever - they have people telling them that there is a problem with a backbone router and they ignore it

Andy
28-07-2008, 01:15
Although not reported on their ticket site, the router th2-1-6k might still be having issues. It is a backbone router, and if you're being routed through it, it will cause problems. The two backbone routers are generally load balanced, so you might be unlucky and going through the bad one. I personally have seen no issues. I set a ping going today and after 6 hours I had about 40000 pings with just 100 of them dropped, which is about normal loss anyway.

I suggest anyone with issues ring OVH in the morning and get them to investigate. They often ignore any problems on the forum unless it's order-related.
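
To put a number on the ping test above, here is a rough sketch that turns the sent and dropped counts into a loss percentage. The counts are the approximate ones quoted in the post, and the helper name is hypothetical.

```python
def packet_loss_percent(sent: int, lost: int) -> float:
    """Return packet loss as a percentage of packets sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= lost <= sent:
        raise ValueError("lost must be between 0 and sent")
    return 100.0 * lost / sent


# Roughly 40000 pings over 6 hours with about 100 dropped.
print(f"{packet_loss_percent(40000, 100):.2f}% loss")
```

With those figures the loss comes out at a quarter of one percent.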

JALZOO
27-07-2008, 23:30
Damn, 18 hops to your servers and it's only in France!

BELLonline
27-07-2008, 16:01
It doesn't seem to me like OVH are aware or bothered about this, hopefully they will notice this thread and see how many people are affected.

Here's my MRTG to show how bad it is!

(MRTG graph for 26-27 July 2008; image not preserved.)

v0x
27-07-2008, 15:16
15:16 BST, we just got hit with another 30-60 second chunk of downtime. This is really destructive!

Here is a post-event traceroute; I'll try to get one while it's happening next time.


Tracing route to * [213.186.57.*]
over a maximum of 30 hops:

 1    3 ms   <1 ms   <1 ms  192.168.1.25
 2   30 ms   28 ms   28 ms  *[*]
 3   29 ms   29 ms   29 ms  *
 4   29 ms   29 ms   29 ms  213.123.80.6
 5   31 ms   29 ms   29 ms  217.41.171.9
 6   30 ms   29 ms   29 ms  217.41.171.66
 7   29 ms   30 ms   28 ms  217.41.217.58
 8   29 ms   29 ms   30 ms  217.41.217.38
 9   29 ms   29 ms   29 ms  217.47.66.98
10   28 ms   29 ms   29 ms  62.6.40.98
11   31 ms   29 ms   31 ms  core1-pos0-0-0-7.ilford.ukcore.bt.net [62.6.204.242]
12   31 ms   28 ms   30 ms  transit1-gig7/0/0.ilford.ukcore.bt.net [194.74.77.170]
13   29 ms   29 ms   29 ms  t2c1-ge14-0-0.uk-ilf.eu.bt.net [166.49.168.89]
14   31 ms   30 ms   31 ms  pos4-0-2.ar6.lon3.gblx.net [64.212.225.13]
15    *      38 ms   36 ms  020g.gsw-1-6k.routers.ovh.net [213.186.32.130]
16  202 ms  203 ms  203 ms  040g.p19-7-6k.routers.ovh.net [213.186.32.145]
17   40 ms   38 ms   38 ms  p19-4-m1.routers.ovh.net [213.186.32.18]
18   39 ms   38 ms   37 ms  * [213.186.57.*]


Identifying information removed.

While obviously the times don't mean much due to deprioritisation, the path should show something.
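
As a rough illustration of reading a trace like this, the sketch below parses Windows tracert hop lines and flags hops whose best round-trip time is far above that of the final hop. A spike that does not carry through to later hops (hop 16 here) usually points to a router deprioritising its own replies rather than to real delay on the path. The function names and the factor of 2 are assumptions chosen for the example, and the regular expression only covers the line layout shown above.

```python
import re

# One tracert hop line: hop number, three RTT fields ("<1 ms", "29 ms" or "*"), then the host.
HOP_RE = re.compile(r"^\s*(\d+)\s+((?:(?:<?\d+\s+ms|\*)\s+){3})(.*)$")
RTT_RE = re.compile(r"<?(\d+)\s+ms|\*")


def parse_hop(line):
    """Return (hop_number, [rtt_ms or None, ...], host), or None if the line is not a hop."""
    match = HOP_RE.match(line)
    if match is None:
        return None
    # A "*" field (no reply) becomes None; "<1 ms" is treated as 1 ms.
    rtts = [int(m.group(1)) if m.group(1) else None
            for m in RTT_RE.finditer(match.group(2))]
    return int(match.group(1)), rtts, match.group(3).strip()


def isolated_spikes(lines, factor=2.0):
    """Return hop numbers whose best RTT exceeds factor times the final hop's best RTT."""
    hops = [hop for hop in map(parse_hop, lines) if hop is not None]
    answered = [(number, min(r for r in rtts if r is not None))
                for number, rtts, _host in hops
                if any(r is not None for r in rtts)]
    if not answered:
        return []
    final_rtt = answered[-1][1]
    return [number for number, best in answered[:-1] if best > factor * final_rtt]


# The last five hops of the trace above.
TRACE = """
14 31 ms 30 ms 31 ms pos4-0-2.ar6.lon3.gblx.net [64.212.225.13]
15 * 38 ms 36 ms 020g.gsw-1-6k.routers.ovh.net [213.186.32.130]
16 202 ms 203 ms 203 ms 040g.p19-7-6k.routers.ovh.net [213.186.32.145]
17 40 ms 38 ms 38 ms p19-4-m1.routers.ovh.net [213.186.32.18]
18 39 ms 38 ms 37 ms * [213.186.57.*]
""".strip().splitlines()

print(isolated_spikes(TRACE))
```

A trace captured during an outage would be more telling, since this one only shows the path after the event.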

IainK
27-07-2008, 13:44
It would be nice if staff could check in here, but sadly I reckon they are far too busy as it is.
Nice one pointing them in the right direction though BELLonline!

BELLonline
27-07-2008, 13:41
I actually opened a ticket a couple of days ago and got a reply today asking me to confirm whether it is still happening. I said it is and directed them towards this post, as it seems they aren't looking in here at the moment.

IainK
27-07-2008, 13:37
Seems OVH support still don't work weekends, I just tried to call. I hope that this issue is at least being looked into.

I seem to remember seeing a maximum downtime promise of 45 minutes a month. If this is dropping out multiple times a day, on the UK peering to UK customers no less, I think that could be considered a refundable offence!
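
For a sense of scale, here is a small sketch converting a monthly downtime allowance into an availability percentage. The 45 minutes is only the figure recalled above, not a confirmed OVH term, and the 30-day month and function name are assumptions for the example.

```python
def availability_percent(downtime_minutes: float, days_in_month: int = 30) -> float:
    """Availability implied by a monthly downtime allowance, as a percentage."""
    minutes_in_month = days_in_month * 24 * 60
    return 100.0 * (1 - downtime_minutes / minutes_in_month)


# 45 minutes out of a 30-day month (43200 minutes).
print(f"{availability_percent(45):.3f}%")
```

On those assumptions the allowance corresponds to just under 99.9% availability, so a handful of one-to-two-minute drops per day would use it up within a few weeks.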

BELLonline
27-07-2008, 12:04
I've found that the connection is dropping a lot from the UK but it seems fine from locations in the USA - quite ironic really that the reason they only accept customers from certain countries is that they "can only guarantee good connectivity to those countries". It seems odd that a country not on the list is experiencing much better connectivity.

Can someone from OVH please confirm that this issue is being looked at?

Sean
27-07-2008, 09:48
Quote Originally Posted by v0x
As a follow up to my post, the following event on the event tracker translates as:

http://travaux.ovh.net/?do=details&id=2339



This would seem to fit the bill. (Although I'm unaware of what this particular link serves, reboots would fit the 60-120 seconds average downtime that have been happening bi-hourly over the last 18 hours.)

Again, a much more detailed explanation would be appreciated - I'm also interested to know why what, as far as I can diagnose, is a fairly large network fault, has gone unaddressed/announced/dealt with for so long?

I held off calling tech support this time; next time, I won't.
Couldn't agree with this post more. It would be great if someone from OVH would let us know what's happening. I am still being affected by this.

Nettus
27-07-2008, 02:13
Funny thing, after I posted on here my server went down for a few moments.

BELLonline
27-07-2008, 01:19
Yeah I've also been having problems which is unusual because their network is usually so good.

Nettus
26-07-2008, 12:03
My server has been down once and that's it.

Very reasonable, I think.

v0x
25-07-2008, 23:05
As a follow up to my post, the following event on the event tracker translates as:

http://travaux.ovh.net/?do=details&id=2339

We have a problem with a 10G card on the router th2-1-6k.
The card has rebooted 3 times in 1 hour.

This concerns the connection between p19 and th2 and Sfinx.
This would seem to fit the bill. (Although I'm unaware of what this particular link serves, reboots would fit the 60-120 seconds average downtime that have been happening bi-hourly over the last 18 hours.)

Again, a much more detailed explanation would be appreciated - I'm also interested to know why what, as far as I can diagnose, is a fairly large network fault, has gone unaddressed/announced/dealt with for so long?

I held off calling tech support this time; next time, I won't.

v0x
25-07-2008, 22:34
I've been having problems all day, and so have the users on my net. I have been quietly livid.

When these problems occur, they affect not only my servers but OVH sites. (I know because I've been watching the weathermap, forums, and http://travaux.ovh.net/ closely.)

Using a looking glass tool shows that these are not complete outages; it just seems like certain 'paths' are affected.

So say, at 10:00am we'll lose the connection between our OVH hub and server 1 in the UK, but server 2 in the UK and other machines will be fine. Later on, say 13:00, we'll lose server 2 but not server 1. Sometimes we'll be lucky, and just lose users.

I can provide exact times if needed.

I have three servers with OVH, and every one has been hit at some point today.

Would someone from OVH care to explain what has been going on?

Andy
25-07-2008, 18:52
Quote Originally Posted by Sean
Been having lots of downtime today, up and down quite a lot. Even ovh.net is not responding at times unfortunately.

I'm involved with approx. 5 OVH servers (all Linux) and all are going up and down today.
Perhaps OVH are updating some of their network today. I haven't seen any problems and my bandwidth graphs show no downtime.

Sean
25-07-2008, 18:49
Been having lots of downtime today, up and down quite a lot. Even ovh.net is not responding at times unfortunately.

I'm involved with approx. 5 OVH servers (all Linux) and all are going up and down today.

Andy
25-07-2008, 18:02
My server has been fine for the last 5 months, no random downtime at all. If your server is Windows, I suggest updating your NIC driver.

lofty
25-07-2008, 16:07
hmm

jester
25-07-2008, 16:06
Yeah, I get it at the weekend: unplanned errors.

lofty
25-07-2008, 15:44
Hi all,

Has anyone noticed OVH going up and down this afternoon?

On several occasions I have lost all connectivity with OVH and my servers.

I have 2 internet connections here (different providers) and it's happened on both.


Any ideas?