Server dying, need help please


Nettus
10-04-2008, 01:01
I have that problem only when installing Windows on my server.

milanyc
10-04-2008, 00:24
Thank you very much for manually checking the server for me. I appreciate your quick response very much.

Marco
09-04-2008, 14:25
I have scheduled an intervention to check the hard drive. It will start at 2:20 pm, and you will be informed by email of its status and of when it finishes.

If your server is in noSLA mode and you are not happy with it, you should contact Oles again, as we don't have control over that setting. I can only tell you that SLA is the mode without limitations. Let me know.

milanyc
09-04-2008, 12:03
Thank you so much, Marco, for helping me on this one. The only reason I'm not on the French forum is that I don't speak French. I have an apartment in Paris as well as my business, but my French is very poor.

I really didn't install anything special on the server. I had GPROFTPD for FTP, and most of the time I used the server for file storage and sharing. Sometimes I would use the torrent application. However, most of the problems came after I asked Octave to switch me to the noSLA bandwidth, since I used to have a good experience in that mode. I don't know what happened, but my routing changed, latency skyrocketed, and speeds slowed to a crawl. At this point I would like to ask you to switch me back to SLA mode, as that might give me slightly better performance.
This problem has been bothering me for the whole last month, ever since you guys disabled the SLA switch in the manager. At the same time, it looks like something might be wrong with the hard drive, so I would be very pleased if you could look at the drive as well.

The best time would be right now, today, or as soon as possible. Just let me know here or via email.

Thank you so much one more time.

Marco
09-04-2008, 11:38
Your account is based in France, so you should really refer to the French support team, who also speak English, but on this occasion I'm willing to help you.

At the moment I can see that your server is booting from the hard drive and the NX server is responding. It should be working fine according to the information I have. Can you tell me whether the server stopped working after you added some software, or anything else that could lead me to the cause of the problem?

I will schedule an intervention to check the hard drive for you, or anything else you want to make sure is working properly. Just tell me exactly when and what you would like checked, as the server will be down during that time.

sledge0303
09-04-2008, 09:59
Hi,

Before you start installing services on your server, you should enable boot logging (bootlogd). Example for Debian:

Code:
echo 'BOOTLOGD_ENABLE=Yes' > /etc/default/bootlogd
On Gentoo, edit

/etc/conf.d/rc
Search for RC_BOOTLOG and enable it. RC_BOOTLOG="no" is the default setting.

If your server doesn't start, this lets you find out what the reason might be. Option #1: reboot the server in rescue mode, mount your HDD on /mnt, use 'chroot' to enter your system, and go to the '/var/log' directory.
You'll find some logs there, especially 'dmesg' and 'bootlogd' (the names may differ slightly).
You can read them this way:

Code:
less /var/log/dmesg
Sometimes it is important to know the timestamp of the 'dmesg' file. It should match the time of your server's last (re)boot.
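
One way to check that timestamp, as a rough sketch: the helper below (the name log_mtime is made up for this example) prints when a file was last written. It assumes a `date` that accepts -r FILE, as GNU coreutils does. If the time it reports for /var/log/dmesg is older than your last reboot attempt, that boot probably never got far enough to write its logs.

```shell
# Hypothetical helper: print a file's last-modification time in UTC.
# Assumes `date -r FILE` is available (GNU coreutils; not the BSD meaning of -r).
log_mtime() {
    # $1: path to the log file, e.g. /var/log/dmesg
    date -u -r "$1" '+%Y-%m-%d %H:%M:%S UTC'
}

# Example, run inside the chroot:
#   log_mtime /var/log/dmesg
```

Compare the printed time with the moment you triggered the reboot (for example, the time in the monitoring email).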

Maybe your network was simply down after the reboot. Search the dmesg file for network interface entries with:

Code:
grep eth /var/log/dmesg
If you need some help, feel free to mail me or post parts of your logs.
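
The rescue-mode steps described in this post could be strung together roughly as follows. This is only a sketch under assumptions: ROOT_DEV is a placeholder (the real root device depends on the server's disk layout and is not given in this thread), and the helper names run and inspect_boot_logs are invented for the example. By default it only prints the commands; set DRY_RUN=0 in the rescue system to actually execute them.

```shell
# Sketch of the rescue-mode log inspection. All names here are illustrative.
ROOT_DEV="${ROOT_DEV:-/dev/sda1}"   # ASSUMPTION: hypothetical root partition
MNT="${MNT:-/mnt}"                  # mount point used in the post
DRY_RUN="${DRY_RUN:-1}"             # 1 = only print the commands (default)

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

inspect_boot_logs() {
    run mount "$ROOT_DEV" "$MNT"                # mount the installed system
    run chroot "$MNT" ls -l /var/log            # list the logs inside it
    run chroot "$MNT" grep eth /var/log/dmesg   # network interface entries
    run umount "$MNT"                           # unmount before rebooting
}
```

Calling inspect_boot_logs with the defaults prints the four commands, so they can be reviewed before anything touches the disk.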

milanyc
09-04-2008, 06:15
Dear Sir or Madam,

Your server has started in 'Rescue' mode; that means that
core Linux/BSD system was launched on your server through the
network. This is not the system, which is normally installed
on your server, none of your hard disks has been reached.

You can connect yourself in SSH to your (91.121.111.28)server
with the following parameters:
- user: root
- password: xxxxxx

You can now carry out the maintenance actions necessary to
the re-establishment of your server.

For example, you may:

- check and update your files of configuration network,
- check and if required decontaminate your firewall,
- check and update your LILO (or to configure another
Netboot via the network)
- launch a manual checking of your filing system,
- carry out a backup or a recovery of data,
- etc.

If you reckon you've identified the origin of the problem and
wish to start again your server normally, you must configure the
netboot to the hard disk or through an OVH-certified core:
http://help.ovh.co.uk/KernelNetboot/
Then reboot your server in soft (avoid to reboot via the Manager).

You will find additional information in our guide: http://guides.ovh.co.uk/RescueMode

Kind regards,
OVH Technical support
------------------------------------------------

Now I've received this email, after I had already reinstalled the server...
The root password doesn't work. The server is pingable now, but neither the admin nor the root password works at all.

Would appreciate your help.

milanyc
09-04-2008, 06:07
Now, after the reinstall, I cannot even connect to the server using NX Client.

The NX service is not available or the NX access was disabled on host 91.121.111.28
NX> 203 NXSSH running with pid: 4748
NX> 285 Enabling check on switch command
NX> 285 Enabling skip of SSH config files
NX> 285 Setting the preferred NX options
NX> 200 Connected to address: 91.121.111.28 on port: 22
NX> 211 The authenticity of host '91.121.111.28 (91.121.111.28)' can't be established.
RSA key fingerprint is xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.
Are you sure you want to continue connecting (yes/no)?
Warning: Permanently added '91.121.111.28' (RSA) to the list of known hosts.
NX> 202 Authenticating user: nx
NX> 208 Using auth method: publickey
NX> 204 Authentication failed.

Something is very wrong here.

milanyc
09-04-2008, 06:02
I've been having pretty serious issues with my server for about a month. I have several posts on this forum, but nothing has been done. My server is ns200552.ovh.net.
I've been warning the OVH staff that something might be severely wrong with my server, but I have been ignored.
Well, the time has come... My server is completely unpingable and not responsive to SSH. I've received one of the generic emails from the OVH system:
----------------------------------------

Dear Sir or Madam,

Our system of monitoring has just detected a defect on
your server ns200552.ovh.net.
This defect was noted at 2008-04-09 03:05:43

Our technicians, who work 24/7, have also received
this alert. However, they may be conducting another
intervention and we're unable to give you the precise time.

You will be informed at the beginning of the intervention
by email.

While waiting for the intervention of our teams, you
always have the possibility
of hardware reboot in the manager.

Logs:
----------------------
PING ns200552.ovh.net (91.121.111.28) from 213.186.33.13 : 56(84) bytes of data.
From 213.186.33.13: Destination Host Unreachable
From 213.186.33.13: Destination Host Unreachable
From 213.186.33.13: Destination Host Unreachable
From 213.186.33.13: Destination Host Unreachable
From 213.186.33.13: Destination Host Unreachable
From 213.186.33.13: Destination Host Unreachable

----------------------------------------

About 1.5 hours later I received this one:


Date: 2008-04-09 03:05:43 : ns200552.ovh.net detected as down
Date: 2008-04-09 04:17:54, Vincent made Software diagnosis: Serveur en dd et en bzimage en erreur [server fails when booting from hard disk and from bzImage]:
"VFS: Cannot open root device "902" or unknown-block(9,2)
Please append a correct "root=" boot option
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(9,2)"

serveur place en rescue pro pour reparation software [server placed in rescue-pro mode for software repair]
ping ok,ssh ok

--------------------------------------------
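
A note for readers puzzling over the same diagnosis: the "902" is how the kernel names the root device it could not open. It is a hexadecimal number packing the major and minor device numbers, the same pair (9 and 2) shown in parentheses in the message. On Linux, block major 9 belongs to the md software-RAID driver, so this most likely points at /dev/md2, meaning the kernel could not find the RAID array holding the root filesystem. A minimal sketch of the decoding, assuming the classic layout (low 8 bits are the minor number, the bits above them the major number), which holds for small numbers like this one; the function name is made up for the example:

```shell
# Decode a kernel root-device number given in hex (e.g. "902").
# ASSUMPTION: classic layout, minor = low 8 bits, major = the bits above.
decode_root_dev() {
    dev=$((0x$1))
    echo "major=$((dev >> 8)) minor=$((dev & 0xff))"
}

decode_root_dev 902   # prints: major=9 minor=2
```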

I wasn't very clear on what that really meant, as it wasn't fully in English. In the meantime, after that second message, I decided to reformat and reinstall Ubuntu just to make a fresh start... I reinstalled it, received the new password, logged in through NX Client, and started the Ubuntu updates... In the middle of updating I lost the link, and the server is now completely unresponsive to both ping and SSH...


Pinging 91.121.111.28 with 32 bytes of data:

Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
.... for about 1hour....

I really need some help. The server is in noSLA mode at my request, and ever since I asked Oles for that, the whole quality has gone downhill... I have been getting terrible FTP speeds, usually about 12-16 kB/s, and during the day I'm pinging out, losing connection, etc. I strongly believe that either the Ethernet switch or the hard disk is dying, and I'm kindly asking you guys to help. Also, I would love to go back to SLA mode, as this has been a serious month-long nightmare.

I've purchased a 3-month deal on this server, and for the whole last month the experience has not been pleasant at all. I'd like to renew the contract, which expires in 1 month, but if the service stays like this, there is no point in my using OVH anymore.

I have a feeling that by the time you guys read this and ping my server, things might be OK, but please take your time and at least check my hard disk manually, as something is not right. Also, I would really appreciate being switched back to SLA.

Thank you very much.