OVHcloud Private Cloud Status

FS#9551 — pcc-1a/b-n7
Scheduled Maintenance Report for Hosted Private Cloud
Completed
We are going to upgrade the supervisor cards of the two Nexus 7000 switches from SUP-1 to SUP-2.

This upgrade will enable routing on the Nexus.

The maintenance work will begin at 00:00 (CEST) this Tuesday, October 21st.
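The planned start is given in CEST, while the update timestamps in this report are in UTC. A minimal sketch for converting one into the other, assuming a fixed UTC+2 offset (valid for CEST on these dates); the helper name `utc_to_cest` is hypothetical:

```python
from datetime import datetime, timedelta, timezone

# CEST is UTC+2; a fixed offset is enough for the dates in this report.
CEST = timezone(timedelta(hours=2), "CEST")

def utc_to_cest(stamp):
    """Convert a 'YYYY-MM-DD HH:MM:SS' UTC timestamp to a CEST datetime."""
    utc = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return utc.astimezone(CEST)

# Timestamp of the first update in this report.
FIRST_UPDATE = utc_to_cest("2013-10-21 23:24:36")
print(FIRST_UPDATE.strftime("%Y-%m-%d %H:%M:%S %Z"))
```

For the first update this gives 2013-10-22 01:24:36 CEST.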

Update(s):

Date: 2013-10-24 15:21:54 UTC
All done, we're on 6.1. We'll talk to Cisco
to find out why it crashed.


Date: 2013-10-24 13:53:19 UTC
Go:
We're cutting all the ports on A.
Traffic will flow via B.
No downtime expected.

Date: 2013-10-24 13:52:17 UTC
We're shutting down all ports on chassis A, which is on 6.2,
so that the traffic flows only via the chassis on 6.1.

We will then move A to 6.1 (hot or cold swap).

We will bring all ports back up and everything
should work as it did before the update.

We will then launch the instant update of B to
version 6.2, and we'll see if it still crashes on the
2nd supervisor. If so, we will let Cisco fix the problem for us,
however long it takes.

Date: 2013-10-24 13:33:39 UTC
oles: I'm taking over the maintenance.

Date: 2013-10-24 07:45:36 UTC
There were some incidents during the switchover. Some of the N5s stopped forwarding the traffic onto pcc-1a-n7.

We have stopped the update for tonight in order to investigate this N5 forwarding issue.

Date: 2013-10-24 00:14:55 UTC
We switched traffic to pcc-1a-n7 in order to upgrade pcc-1b-n7.

Date: 2013-10-23 22:23:47 UTC
Traffic was resumed. Some ports are down.

Date: 2013-10-23 22:22:45 UTC
N7 is up. We are resuming traffic.

Date: 2013-10-23 22:21:32 UTC
N7 is up. Linecards booted.

Date: 2013-10-23 22:21:00 UTC
We rebooted it.

Date: 2013-10-23 22:20:46 UTC
Traffic is switched.

We will start the update.

Date: 2013-10-23 22:20:06 UTC
We will update the switches using a cold swap.

We will switch traffic to pcc-1b-n7 in order to upgrade pcc-1a-n7.

Date: 2013-10-23 05:40:24 UTC
The new card couldn't be synchronised.

We are contacting the supplier.

pcc-1a-n7# 2013 Oct 23 04:27:17 pcc-1a-n7 %SYSMGR-2-GSYNC_SNAPSHOT_SRVFAILED: Service "spm" on active supervisor failed to store its snapshot (error-id 0x4048000C).
2013 Oct 23 04:27:17 pcc-1a-n7 %SYSMGR-2-STANDBY_BOOT_FAILED: Standby supervisor failed to boot up.
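
The two log messages above follow the `%FACILITY-SEVERITY-MNEMONIC: text` layout. A minimal sketch for splitting such lines into fields, for example to filter on severity; the regular expression and the helper name `parse_syslog` are assumptions made for illustration, not part of any Cisco tooling:

```python
import re

# Assumed layout: "<timestamp> <host> %FACILITY-SEVERITY-MNEMONIC: text"
SYSLOG_RE = re.compile(
    r"(?P<timestamp>\d{4} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+): "
    r"(?P<text>.*)"
)

def parse_syslog(line):
    """Return a dict of fields, or None if the line does not match."""
    match = SYSLOG_RE.search(line)  # search() skips a leading CLI prompt
    if match is None:
        return None
    fields = match.groupdict()
    fields["severity"] = int(fields["severity"])
    return fields

LINES = [
    'pcc-1a-n7# 2013 Oct 23 04:27:17 pcc-1a-n7 %SYSMGR-2-GSYNC_SNAPSHOT_SRVFAILED: '
    'Service "spm" on active supervisor failed to store its snapshot (error-id 0x4048000C).',
    "2013 Oct 23 04:27:17 pcc-1a-n7 %SYSMGR-2-STANDBY_BOOT_FAILED: "
    "Standby supervisor failed to boot up.",
]
PARSED = [parse_syslog(line) for line in LINES]

for entry in PARSED:
    print(entry["host"], entry["facility"], entry["severity"], entry["mnemonic"])
```

Both messages come from the SYSMGR facility with severity 2.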

Date: 2013-10-23 05:39:42 UTC
The new card is inserted. It is being synchronised.


Date: 2013-10-23 02:14:56 UTC
We have received the new card. We are swapping out the faulty one.


Date: 2013-10-23 01:02:49 UTC
The card is failing. The supplier has sent a new one.


Date: 2013-10-22 23:56:54 UTC
One of the supervisor cards went into a reboot loop.

We are suspending the update.

We are opening a ticket with the supplier.


Date: 2013-10-22 22:54:05 UTC
The update starts on pcc-1a-n7.

1 yes non-disruptive rolling
2 yes non-disruptive rolling
3 yes non-disruptive rolling
4 yes non-disruptive rolling
5 yes non-disruptive reset
6 yes non-disruptive reset
7 yes non-disruptive rolling
8 yes non-disruptive rolling
9 yes non-disruptive rolling
10 yes non-disruptive rolling
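
The table above can be summarised by install type. A minimal sketch, assuming the four unlabelled columns are module number, bootable flag, impact and install type (the source does not name them); the helper `group_by_install_type` is hypothetical:

```python
from collections import defaultdict

IMPACT_TABLE = """\
1 yes non-disruptive rolling
2 yes non-disruptive rolling
3 yes non-disruptive rolling
4 yes non-disruptive rolling
5 yes non-disruptive reset
6 yes non-disruptive reset
7 yes non-disruptive rolling
8 yes non-disruptive rolling
9 yes non-disruptive rolling
10 yes non-disruptive rolling
"""

def group_by_install_type(table):
    """Map each install type to the list of module numbers that use it."""
    groups = defaultdict(list)
    for line in table.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed rows
        # Assumed column order: module, bootable, impact, install type.
        module, _bootable, _impact, install_type = parts
        groups[install_type].append(int(module))
    return dict(groups)

GROUPS = group_by_install_type(IMPACT_TABLE)
print(GROUPS)
```

Under that reading, modules 5 and 6 are the only ones listed with a reset; all other modules are upgraded in a rolling fashion.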


Date: 2013-10-22 22:53:36 UTC
We are starting the update.

Date: 2013-10-22 22:53:18 UTC
We will update the N7 in order to see if this fixes the issue.

We are starting the maintenance at 00:00 (CEST).

Date: 2013-10-22 06:25:31 UTC
Following the maintenance, we noticed an issue with the management VLAN of some hosts (about twenty). After troubleshooting, the network does not appear to be the cause of the issue. It is actually a side effect of the maintenance that impacts these hosts. We are continuing the troubleshooting together with VMware.


Date: 2013-10-22 01:22:38 UTC
Upgrade done

Date: 2013-10-22 00:41:07 UTC
The N7 has restarted. We are re-applying the configuration.


Date: 2013-10-22 00:40:35 UTC
The N7 is switched off.

We are swapping the cards. The traffic is flowing via pcc-1a-n7.


Date: 2013-10-22 00:39:54 UTC
Done

We are replacing the cards.

Date: 2013-10-22 00:39:40 UTC
We have switched the optics and the ports are up.

We are switching the traffic to pcc-1a-n7 in order to upgrade pcc-1b-n7.



Date: 2013-10-21 23:27:59 UTC
The N7 is up and running.

Some ports are down. We are looking into the cause before proceeding with the other switch.

Date: 2013-10-21 23:26:53 UTC
The N7 has started. We are re-applying the configuration.


Date: 2013-10-21 23:26:29 UTC
The cards have been replaced. We are starting the N7.


Date: 2013-10-21 23:26:08 UTC
The N7 is switched off.

We are swapping the cards. The traffic is flowing via pcc-1b-n7.

Date: 2013-10-21 23:25:30 UTC
The traffic is switched.

We are shutting down the N7 to perform the upgrade.


Date: 2013-10-21 23:24:36 UTC
We are starting the maintenance.

We are switching the traffic to pcc-1b-n7 in order to upgrade pcc-1a-n7.
Posted Oct 21, 2013 - 07:49 UTC