OVHcloud Bare Metal Cloud Status

FS#5708 — rps1a/1b
Scheduled Maintenance Report for Bare Metal Cloud
Completed
We have an unidentified communication problem
between different devices connected to
rps-1a and rps-1b. Pings between them are unreliable.


We are going to restart rps-1a and, once it is back up,
reboot rps-1b.

Update(s):

Date: 2011-08-22 23:02:02 UTC
sup10 up.

This supervisor (sup-9)
-----------------------
Redundancy state: Active
Supervisor state: Active
Internal state: Active with HA standby

Other supervisor (sup-10)
------------------------
Redundancy state: Standby
Supervisor state: HA standby
Internal state: HA standby
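
As an illustration only, the redundancy output above can be checked mechanically. The sketch below is not OVH tooling: the function names `parse_redundancy` and `is_ha_ready` are hypothetical, and the rule "one supervisor Active, the other Standby" is an assumed readiness criterion based on the output shown here.

```python
import re

# Redundancy status text copied from the update above.
SAMPLE = """\
This supervisor (sup-9)
-----------------------
Redundancy state: Active
Supervisor state: Active
Internal state: Active with HA standby

Other supervisor (sup-10)
------------------------
Redundancy state: Standby
Supervisor state: HA standby
Internal state: HA standby
"""

def parse_redundancy(text):
    """Return {supervisor_name: {field: value}} parsed from the status text."""
    result = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.search(r"\((sup-\d+)\)", line)
        if header:
            # A header such as "This supervisor (sup-9)" starts a new section.
            current = result.setdefault(header.group(1), {})
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    return result

def is_ha_ready(status):
    """Assumed criterion: exactly one Active and one Standby supervisor."""
    states = sorted(s.get("Redundancy state", "") for s in status.values())
    return states == ["Active", "Standby"]

status = parse_redundancy(SAMPLE)
print(status["sup-10"]["Supervisor state"], is_ha_ready(status))
```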



Date: 2011-08-22 23:01:40 UTC
We are again seeing the "usual" errors from the Nexus 7016 SUP.

Aug 22 17:14:41 rbx-97-n7.routers.ovh.net : 2011 Aug 22 17:14:23 CEST: %IPFIB-SLOT3-2-FIB_TCAM_RESOURCE_EXHAUSTION: FIB TCAM exhausted
Aug 22 17:14:41 rbx-97-n7.routers.ovh.net : 2011 Aug 22 17:14:23 CEST: %IPFIB-SLOT3-4-FIB_TCAM_PF_INSERT_FAIL: FIB TCAM prefix insertion fail
Aug 22 17:14:41 rbx-97-n7.routers.ovh.net : 2011 Aug 22 17:14:23 CEST: %IPFIB-SLOT3-2-FIB_TCAM_RESOURCE_EXHAUSTION: FIB TCAM exhausted
Aug 22 17:14:41 rbx-97-n7.routers.ovh.net : 2011 Aug 22 17:14:23 CEST: %IPFIB-SLOT3-4-FIB_TCAM_PF_INSERT_FAIL: FIB TCAM prefix insertion fail
Aug 22 17:15:41 rbx-97-n7.routers.ovh.net : 2011 Aug 22 17:15:23 CEST: %IPFIB-SLOT3-2-FIB_TCAM_RESOURCE_EXHAUSTION: FIB TCAM exhausted
Aug 22 17:15:41 rbx-97-n7.routers.ovh.net : 2011 Aug 22 17:15:23 CEST: %IPFIB-SLOT3-4-FIB_TCAM_PF_INSERT_FAIL: FIB TCAM prefix insertion fail

Connectivity remains OK.
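
To see at a glance how often each error recurs, log lines like those above can be tallied by message tag. This is a minimal sketch, not the tooling used during the incident: the `TAG` pattern and the `count_messages` helper are hypothetical, and the pattern assumes the `%FACILITY[-SLOTn]-SEVERITY-MNEMONIC:` layout visible in these lines.

```python
import re
from collections import Counter

# Assumed tag layout, e.g. %IPFIB-SLOT3-2-FIB_TCAM_RESOURCE_EXHAUSTION:
# facility (with optional slot), one-digit severity, mnemonic.
TAG = re.compile(
    r"%(?P<facility>[A-Z0-9_]+(?:-SLOT\d+)?)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+):"
)

def count_messages(lines):
    """Count log lines per (facility, severity, mnemonic); untagged lines are skipped."""
    counts = Counter()
    for line in lines:
        m = TAG.search(line)
        if m:
            counts[(m["facility"], int(m["severity"]), m["mnemonic"])] += 1
    return counts

# Two of the lines quoted in the update above.
LINES = [
    "Aug 22 17:14:41 rbx-97-n7.routers.ovh.net : 2011 Aug 22 17:14:23 CEST: "
    "%IPFIB-SLOT3-2-FIB_TCAM_RESOURCE_EXHAUSTION: FIB TCAM exhausted",
    "Aug 22 17:14:41 rbx-97-n7.routers.ovh.net : 2011 Aug 22 17:14:23 CEST: "
    "%IPFIB-SLOT3-4-FIB_TCAM_PF_INSERT_FAIL: FIB TCAM prefix insertion fail",
]

for key, n in count_messages(LINES).items():
    print(key, n)
```

Run over the six lines quoted above, this would report each of the two messages three times, all on slot 3.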

Date: 2011-08-22 23:00:39 UTC
BGP is up with 131K routes. Connectivity remains OK.

We are waiting for both SUP cards to synchronise.


Date: 2011-08-22 22:59:41 UTC
Reboot of sup 10: connectivity remains OK.

So the fault appears to have cleared.
We are going to re-enable BGP to check whether it is the cause.

Date: 2011-08-22 22:57:14 UTC
The problem is caused by the Cisco Nexus 7016.

The problem seems to occur between IPs in different VLANs that are connected to the same card (here card 3, an N7K-M132XP-12, which is not an XL card).
It may be a fault left behind by a hot update (ISSU), and/or by the fact that we shut down BGP while it carried 100K routes and this was not handled cleanly. Only a hard reboot of everything restored service.

We are going to re-enable BGP with those 100K routes and see whether the problem reappears.
We will also find out whether card 9 is good or not.


Date: 2011-08-22 19:19:44 UTC
We reboot both rps-1x-n5: still the same problem
We unconfigure the unused VLANs: still the same problem
We unconfigure HSRP on the N7: still the same problem
We remove BGP from the N7: still the same problem
We reboot sup 10, sup 9 takes over: still the same problem
We reboot sup 9, sup 10 takes over: YES! It pings!

Nasty bugs everywhere!
Posted Aug 22, 2011 - 19:14 UTC