OVHcloud Network Status

FS#12344 — gra1-17a/b-n6
Scheduled Maintenance Report for Network & Infrastructure
Completed
We are updating the pair of Nexus switches to correct a malfunction that recently had a strong impact on the pair (Eth_portSec HA).

The intervention will begin at 11:30 PM. No traffic interruption is expected, as the update will be performed hot (non-disruptively).


Update(s):

Date: 2015-01-12 09:37:46 UTC
Both switches have been updated to the latest available version, and they should now support hot updates. We are going to open a case with the vendor to learn why hot updates are not working correctly.

gra1-17a-n6# sh version | i sys
system: version 7.0(5)N1(1a)

Date: 2015-01-12 09:34:51 UTC
There are no longer any servers in monitoring. The switches' port configuration is not being replicated at the moment.

Date: 2015-01-12 09:32:44 UTC
The vast majority of the servers are online; we are intervening on the remaining machines.

Date: 2015-01-12 09:24:56 UTC
The two switches are UP with their FEXs. The infrastructure is stable.

Date: 2015-01-12 09:24:06 UTC
The FEXs are coming back online progressively.

FEX      FEX            FEX               FEX                Fex
Number   Description    State             Model              Serial
------------------------------------------------------------------------
100      fex100         Offline           N2K-C2248TP-E-1GE  FOX1749GBA7
101      fex101         Online            N2K-C2248TP-E-1GE  FOX1749GB8U
102      fex102         Online            N2K-C2248TP-E-1GE  FOX1749GBA9
103      fex103         Online            N2K-C2248TP-E-1GE  FOX1749GXLY
104      fex104         Online            N2K-C2248TP-E-1GE  SSI173606K5
105      fex105         Online            N2K-C2248TP-E-1GE  FOX1749GX9Z
106      fex106         Offline           N2K-C2248TP-E-1GE  SSI173605CE
107      fex107         Online            N2K-C2248TP-E-1GE  FOX1749GXLV
108      fex108         Online            N2K-C2248TP-E-1GE  FOX1749GZAC
109      fex109         Online            N2K-C2248TP-E-1GE  SSI173605CS
110      fex110         Online            N2K-C2248TP-E-1GE  FOX1750GQW8
111      fex111|ASA     Online            N2K-C2248TP-E-1GE  FOX1749GXND
112      fex112         Online            N2K-C2248TP-E-1GE  SSI173605LL
113      fex113         Offline           N2K-C2248TP-E-1GE  SSI173606BC
114      fex114         Online            N2K-C2248TP-E-1GE  SSI173605CA
115      fex115         Online            N2K-C2248TP-E-1GE  SSI17360620
116      fex116         Online            N2K-C2248TP-E-1GE  SSI173600A9
117      fex117         Online            N2K-C2248TP-E-1GE  FOX1749GXLX
118      fex118         Online Sequence   N2K-C2248TP-E-1GE  FOX1749GXLU
119      fex119         Offline           N2K-C2248TP-E-1GE  FOX1750GQS1
120      fex120         Online            N2K-C2248TP-E-1GE  FOX1750GQR8
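
For illustration only, a listing like the one above could be scanned with a short script to find the FEXs that are still offline. This is a minimal sketch, not the tooling used during the intervention: the names `FexRow` and `parse_fex_rows` are hypothetical, and it assumes each FEX occupies a single whitespace-separated line in the order number, description, state, model, serial.

```python
from typing import NamedTuple


class FexRow(NamedTuple):
    number: int
    description: str
    state: str
    model: str
    serial: str


def parse_fex_rows(text: str) -> list[FexRow]:
    """Parse the data rows of a whitespace-separated FEX status listing."""
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        # Data rows start with the numeric FEX id and carry at least 5 fields;
        # header and separator lines are skipped by this check.
        if len(tokens) < 5 or not tokens[0].isdigit():
            continue
        # The state can span several words (e.g. "Online Sequence"), so the
        # model and serial are taken from the right-hand side of the line.
        state = " ".join(tokens[2:-2])
        rows.append(FexRow(int(tokens[0]), tokens[1], state, tokens[-2], tokens[-1]))
    return rows


# A few rows copied from the listing in this report.
SAMPLE = """\
Number Description State Model Serial
100 fex100 Offline N2K-C2248TP-E-1GE FOX1749GBA7
101 fex101 Online N2K-C2248TP-E-1GE FOX1749GB8U
118 fex118 Online Sequence N2K-C2248TP-E-1GE FOX1749GXLU
119 fex119 Offline N2K-C2248TP-E-1GE FOX1750GQS1
"""

rows = parse_fex_rows(SAMPLE)
offline = [row.number for row in rows if row.state == "Offline"]
print(offline)  # [100, 119]
```

Reading the model and serial from the right keeps multi-word states intact, at the cost of assuming that descriptions never contain spaces.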

Date: 2015-01-12 09:22:48 UTC
During the hot (non-disruptive) update on the second switch, the last machine rebooted prematurely and the FEXs were cut off:

2015 Jan 12 02:43:16 gra1-17b-n6 %$ VDC-1 %$ %SATCTRL-FEX100-2-SATCTRL_ISSU_FPORT_FLAP: Nif 0x20000000 flapped during switch ISSU
2015 Jan 12 02:43:52 gra1-17b-n6 %$ VDC-1 %$ %PFMA-2-FEX_STATUS: Fex 112 is offline (Serial number SSI173605LL)
2015 Jan 12 02:43:52 gra1-17b-n6 %$ VDC-1 %$ %PFMA-2-FEX_STATUS: Fex 115 is offline (Serial number SSI17360620)
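
As a rough sketch, log messages of this kind could be filtered to list which FEXs went offline and when. The function name `offline_fex_events` and both regular expressions are assumptions based only on the three messages quoted in this report, and the sketch assumes each message sits on a single line.

```python
import re

# Assumed line shape, inferred from the messages in this report:
# "<timestamp> <host> %$ VDC-1 %$ %<FACILITY>-<SEVERITY>-<MNEMONIC>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) .*?"
    r"%(?P<facility>[A-Z][A-Z0-9_-]*?)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+): "
    r"(?P<message>.*)$"
)
OFFLINE_RE = re.compile(r"Fex (\d+) is offline \(Serial number (\w+)\)")


def offline_fex_events(log_text: str) -> list[tuple[str, str, int, str]]:
    """Return (timestamp, host, fex_number, serial) for each FEX-offline message."""
    events = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None or match["mnemonic"] != "FEX_STATUS":
            continue
        offline = OFFLINE_RE.search(match["message"])
        if offline:
            events.append(
                (match["ts"], match["host"], int(offline.group(1)), offline.group(2))
            )
    return events


# The three messages from this update, one per line.
LOG = """\
2015 Jan 12 02:43:16 gra1-17b-n6 %$ VDC-1 %$ %SATCTRL-FEX100-2-SATCTRL_ISSU_FPORT_FLAP: Nif 0x20000000 flapped during switch ISSU
2015 Jan 12 02:43:52 gra1-17b-n6 %$ VDC-1 %$ %PFMA-2-FEX_STATUS: Fex 112 is offline (Serial number SSI173605LL)
2015 Jan 12 02:43:52 gra1-17b-n6 %$ VDC-1 %$ %PFMA-2-FEX_STATUS: Fex 115 is offline (Serial number SSI17360620)
"""

events = offline_fex_events(LOG)
for ts, host, fex, serial in events:
    print(f"{ts} {host}: FEX {fex} offline ({serial})")
```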

Date: 2015-01-12 09:19:54 UTC
The last FEX is being updated.

Install has been successful.

Date: 2015-01-12 09:19:21 UTC
This time the update was successful.

Module 100: Non-disruptive upgrading.
[# ] 0%

Date: 2015-01-12 09:18:28 UTC
The second switch is being updated. Still non-disruptive.

Date: 2015-01-12 09:15:59 UTC
We tried to downgrade the system to return to a stable state, so that we can attempt the hot update on the second n6.

Date: 2015-01-12 09:14:28 UTC
We've tried to update again. No servers are in monitoring.

Date: 2015-01-12 09:12:03 UTC
The install did not take effect on the FEX.

-- FAIL. Return code 0x4200000E (Image download failed on the FEX).

Remaining action::
"Module(s) 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120 still need to be upgraded".

Install has failed. Return code 0x40930069 (Preload of module image failed).
Please identify the cause of the failure, and try 'install all' again.


We are investigating.

Date: 2015-01-12 09:10:56 UTC
An error occurred; we are waiting to see whether there will be an impact.

2015 Jan 12 00:38:52 gra1-17a-n6 %$ VDC-1 %$ %SATCTRL-FEX102-2-SATCTRL_IMAGE: FEX102 Image update failed [/isan/plugin_img/fexth.bin]: File transfer error

Date: 2015-01-12 09:09:03 UTC
The first FEXs are being updated:

2015 Jan 12 00:34:03 gra1-17a-n6 %$ VDC-1 %$ %SATCTRL-FEX100-2-SATCTRL_IMAGE: FEX100 Image update in progress.
2015 Jan 12 00:34:14 gra1-17a-n6 %$ VDC-1 %$ %SATCTRL-FEX101-2-SATCTRL_IMAGE: FEX101 Image update in progress.
2015 Jan 12 00:34:24 gra1-17a-n6 %$ VDC-1 %$ %SATCTRL-FEX102-2-SATCTRL_IMAGE: FEX102 Image update in progress.

No servers are in monitoring at this time.

Date: 2015-01-12 09:07:45 UTC

Compatibility check is done:
Module  bootable          Impact  Install-type  Reason
------  --------  --------------  ------------  ------
     1       yes  non-disruptive         reset
     2       yes  non-disruptive       rolling
   100       yes  non-disruptive       rolling
   101       yes  non-disruptive       rolling
   102       yes  non-disruptive       rolling
   103       yes  non-disruptive       rolling
   104       yes  non-disruptive       rolling
   105       yes  non-disruptive       rolling
   106       yes  non-disruptive       rolling
   107       yes  non-disruptive       rolling
   108       yes  non-disruptive       rolling
   109       yes  non-disruptive       rolling
   110       yes  non-disruptive       rolling
   111       yes  non-disruptive       rolling
   112       yes  non-disruptive       rolling
   113       yes  non-disruptive       rolling
   114       yes  non-disruptive       rolling
   115       yes  non-disruptive       rolling
   116       yes  non-disruptive       rolling
   117       yes  non-disruptive       rolling
   118       yes  non-disruptive       rolling
   119       yes  non-disruptive       rolling
   120       yes  non-disruptive       rolling
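
A compatibility table like this one could be checked automatically before going ahead. The sketch below is illustrative only: `parse_compat` and `blocking_modules` are hypothetical names, and the column order (module, bootable, impact, install type, optional reason) is assumed from the output above. As the rest of this report shows, a clean check did not by itself guarantee a clean install.

```python
def parse_compat(text: str) -> list[dict]:
    """Parse the data rows of a compatibility-check table into dictionaries."""
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        # Data rows start with the numeric module id; the header and the
        # dashed separator line are skipped by this check.
        if len(tokens) < 4 or not tokens[0].isdigit():
            continue
        rows.append(
            {
                "module": int(tokens[0]),
                "bootable": tokens[1] == "yes",
                "impact": tokens[2],
                "install_type": tokens[3],
                "reason": " ".join(tokens[4:]),  # empty when no reason is given
            }
        )
    return rows


def blocking_modules(rows: list[dict]) -> list[int]:
    """Return modules that are not bootable or not flagged non-disruptive."""
    return [
        row["module"]
        for row in rows
        if not row["bootable"] or row["impact"] != "non-disruptive"
    ]


# A few rows copied from the compatibility check in this report.
SAMPLE = """\
Module bootable Impact Install-type Reason
------ -------- -------------- ------------ ------
1 yes non-disruptive reset
2 yes non-disruptive rolling
100 yes non-disruptive rolling
120 yes non-disruptive rolling
"""

blockers = blocking_modules(parse_compat(SAMPLE))
print(blockers)  # [] means every listed module reports a non-disruptive upgrade
```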




Date: 2015-01-12 09:07:00 UTC
The n6 is ready.

gra1-17b-n6# show system internal mts buffer
MTS buffers in use = 39

Updated!

Date: 2015-01-12 09:06:00 UTC
There are no more problems with the configuration; the protocol is okay.

gra1-17b-n6# sh vpc | i fail
gra1-17b-n6#

The number of buffers in use is going down, and we will soon be able to update.

gra1-17b-n6# show system internal mts buffer
MTS buffers in use = 460

Date: 2015-01-12 09:02:21 UTC
The infrastructure on the router is stable. We are waiting for the buffers to empty:

gra1-17b-n6# show system internal mts buffer
MTS buffers in use = 1022
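
The wait on the buffer count could be expressed as a simple check on the command output. This is a sketch under stated assumptions: the threshold of 100 is hypothetical (this report only shows that 460 was still considered too high and 39 was considered ready), and `mts_buffers_in_use` and `ready_to_update` are illustrative names.

```python
import re

MTS_RE = re.compile(r"MTS buffers in use\s*=\s*(\d+)")

# Hypothetical threshold, chosen only so that the readings in this report
# (1022 and 460: wait, 39: ready) are classified the same way as in the updates.
READY_THRESHOLD = 100


def mts_buffers_in_use(output: str) -> int:
    """Extract the buffer count from 'show system internal mts buffer' output."""
    match = MTS_RE.search(output)
    if match is None:
        raise ValueError("no 'MTS buffers in use' line found")
    return int(match.group(1))


def ready_to_update(output: str, threshold: int = READY_THRESHOLD) -> bool:
    """Return True once the buffer count is at or below the threshold."""
    return mts_buffers_in_use(output) <= threshold


# The three readings reported in this maintenance, oldest first.
READINGS = [
    "MTS buffers in use = 1022",
    "MTS buffers in use = 460",
    "MTS buffers in use = 39",
]

for reading in READINGS:
    print(mts_buffers_in_use(reading), ready_to_update(reading))
```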

Date: 2015-01-12 09:00:42 UTC
We have stopped the robots so that no changes can be made while we are working on the equipment.




Date: 2015-01-12 08:52:44 UTC
An unplanned reboot occurred during the backup of the configuration.

Date: 2015-01-12 08:48:42 UTC
We have reloaded the second switch so that it can be updated hot; at the moment, one of the installed licenses prevents this.
No traffic outage is expected, since the other switch will take over.

Date: 2015-01-11 19:31:25 UTC
The whole configuration has been checked. We will intervene tonight, starting at 22:30, to update the switches.

Date: 2015-01-09 04:26:26 UTC
The first reboot highlighted a difference in configuration between the two switches. We have postponed the maintenance in order to check all configurations.

Date: 2015-01-09 04:25:45 UTC
A reload is necessary to prepare the switch for a hot update. No impact is expected, since the load will be handled by the second switch.

Date: 2015-01-09 04:24:55 UTC
We will start the update.
Posted Jan 08, 2015 - 14:46 UTC