
FS#812 — FS#4847 — HG, under windows

Attached to Project— Dedicated servers
Incident
RBX2
CLOSED
100%
Some HG servers, apparently those running Windows, have not responded
to ping since 6h36. We are still looking for the origin of the problem.


Date:  Thursday, 18 November 2010, 16:41PM
Reason for closing:  Done
Comment by OVH - Thursday, 18 November 2010, 16:16PM

We tried reconfiguring the port in a different way. It does not
work. We recovered one server by moving it to another switch port.
It seems to be a bug in the switch system.
We will see whether we can recover the servers by restarting the
switch.


Comment by OVH - Thursday, 18 November 2010, 16:18PM

Same thing.

We will therefore change the ports of the 7 HG servers under Windows
that no longer work.


Comment by OVH - Thursday, 18 November 2010, 16:19PM

It does not work.

We will update the switch to see whether that fixes the problem.


Comment by OVH - Thursday, 18 November 2010, 16:28PM

We will restart the switch.

Meanwhile, we looked internally for similar problems, and
apparently we had problems with Linux on 10G. We had to
introduce specific procedures to run Linux, with a particular
choice of SFP+ cables and network cards, because of
incompatibilities. We did not have this problem under
Windows.

So we will check at the same time whether this is the same problem as
under Linux, but it is happening long after the introduction of
Windows, and on one network. Very weird.

The reboot of the switch has started.


Comment by OVH - Thursday, 18 November 2010, 16:28PM

sw-n5-14.242# install all kickstart bootflash:n5000-uk9-kickstart.4.2.1.N1.1.bin system bootflash:n5000-uk9.4.2.1.N1.1.bin

Verifying image bootflash:/n5000-uk9-kickstart.4.2.1.N1.1.bin for boot variable "kickstart".
[####################] 100% -- SUCCESS

Verifying image bootflash:/n5000-uk9.4.2.1.N1.1.bin for boot variable "system".
[####################] 100% -- SUCCESS

Verifying image type.
[####################] 100% -- SUCCESS

Extracting "system" version from image bootflash:/n5000-uk9.4.2.1.N1.1.bin.
[####################] 100% -- SUCCESS

Extracting "kickstart" version from image bootflash:/n5000-uk9-kickstart.4.2.1.N1.1.bin.
[####################] 100% -- SUCCESS

Extracting "bios" version from image bootflash:/n5000-uk9.4.2.1.N1.1.bin.
[####################] 100% -- SUCCESS

Notifying services about system upgrade.
[####################] 100% -- SUCCESS



Compatibility check is done:
Module  bootable          Impact  Install-type  Reason
------  --------  --------------  ------------  ------
     1       yes      disruptive         reset  Reset due to single supervisor



Images will be upgraded according to following table:
Module       Image         Running-Version             New-Version  Upg-Required
------  ----------  ----------------------  ----------------------  ------------
     1      system             4.1(3)N2(1)             4.2(1)N1(1)           yes
     1   kickstart             4.1(3)N2(1)             4.2(1)N1(1)           yes
     1        bios        v1.3.0(09/08/09)        v1.3.0(09/08/09)            no
     1   power-seq                    v1.2                    v1.2            no


Switch will be reloaded for disruptive upgrade.
Do you want to continue with the installation (y/n)? [n] y

Install is in progress, please wait.

Setting boot variables.
[####################] 100% -- SUCCESS

Performing configuration copy.
[####################] 100% -- SUCCESS

Module 1: Refreshing compact flash and upgrading bios/loader/bootrom/power-seq.
Warning: please do not remove or power off the module at this time.
Note: Power-seq upgrade needs a power-cycle to take into effect.
On success of power-seq upgrade, SWITCH OFF THE POWER to the system and then, power it up.
[####################] 100% -- SUCCESS

Finishing the upgrade, switch will reboot in 10 seconds.
sw-n5-14.242#
Broadcast message from root (Thu Nov 18 10:26:57 2010):

The system is going down for reboot NOW!
2010 Nov 18 10:26:57 sw-n5-14.242 %KERN-0-SYSTEM_MSG: writing reset reason 31, - kernel


Comment by OVH - Thursday, 18 November 2010, 16:30PM

The switch is up to date. It still does not work.

That leaves a possible hardware problem. We will intervene to change
the hardware.


Comment by OVH - Thursday, 18 November 2010, 16:30PM

The servers do announce their MAC addresses on the network, but they still do not work.


Comment by OVH - Thursday, 18 November 2010, 16:33PM

Of the 53 Windows servers in the racks 27XXX on the network in question,
only 18 do not work. They use DHCP
to boot.

We will change the network card of one of the servers to see whether
that fixes the problem.


Comment by OVH - Thursday, 18 November 2010, 16:40PM

The origin of the problem has been found. Overnight, the teams that
handle the introduction of new servers put the new HG servers
in place. By mistake, they used the IP
of the DHCP servers. This caused the crash of all the HG servers
that use DHCP.

The lack of communication between internal teams in the same
data centre is the cause of this problem. We will fix
this communication problem. We will set up a DHCP server
external to the network. Then we will refund the customers affected by
the crash.
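
As an illustration only, a pre-deployment check that would catch this kind
of address conflict could be sketched in Python as follows. The host names,
the addresses (taken from the 192.0.2.0/24 documentation range) and the
function name find_conflicts are assumptions made for the example; this is
not a description of OVH's actual tooling.

```python
import ipaddress

# Hypothetical reserved infrastructure addresses (documentation range,
# not real OVH addresses). Keys are parsed IPs, values are owner labels.
RESERVED = {
    ipaddress.ip_address("192.0.2.2"): "dhcp-1",
    ipaddress.ip_address("192.0.2.3"): "dhcp-2",
}


def find_conflicts(planned, reserved=RESERVED):
    """Return (hostname, ip, owner) for each planned IP that is already reserved."""
    conflicts = []
    for hostname, ip_text in planned.items():
        ip = ipaddress.ip_address(ip_text)  # raises ValueError on a malformed address
        owner = reserved.get(ip)
        if owner is not None:
            conflicts.append((hostname, str(ip), owner))
    return conflicts


if __name__ == "__main__":
    # Hypothetical provisioning plan for two new servers.
    planned = {"hg-new-1": "192.0.2.2", "hg-new-2": "192.0.2.50"}
    for hostname, ip, owner in find_conflicts(planned):
        print(f"{hostname}: {ip} is reserved for {owner}")
```

Running such a check before new servers are put on the network would flag
any planned address that collides with a DHCP server, under the assumption
that the reserved list is kept up to date by the team that owns it.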