OVHcloud Network Status

FS#12043 — var-1-6k
Incident Report for Network & Infrastructure
Resolved
Electrical maintenance is currently in progress in the building.
The router was impacted during this maintenance.

It responds to ping again, but is not yet fully operational.

CPU is at 100%:
CPU utilization for five seconds: 99%/5%; one minute: 99%; five minutes: 70%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
575 202948 2182 93010 53.94% 49.81% 36.65% 0 BGP Router
330 42060 648 64907 13.38% 12.23% 7.63% 0 IP RIB Update
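
For readers who want to pull the busiest processes out of output like the above, here is a minimal sketch. It assumes the text is a `show processes cpu` style listing in the exact column layout pasted here; the function name `parse_cpu` and the dictionary keys are illustrative choices, not part of any OVH tooling. In the summary line, the first figure of the "five seconds" pair is total CPU and the second is time spent at interrupt level.

```python
import re

# Sample copied from the incident text above.
SAMPLE = """\
CPU utilization for five seconds: 99%/5%; one minute: 99%; five minutes: 70%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
575 202948 2182 93010 53.94% 49.81% 36.65% 0 BGP Router
330 42060 648 64907 13.38% 12.23% 7.63% 0 IP RIB Update
"""

SUMMARY_RE = re.compile(
    r"five seconds: (\d+)%/(\d+)%; one minute: (\d+)%; five minutes: (\d+)%"
)
# PID, Runtime, Invoked, uSecs, three percentages, TTY, process name.
ROW_RE = re.compile(
    r"^\s*(\d+)\s+\d+\s+\d+\s+\d+\s+([\d.]+)%\s+([\d.]+)%\s+([\d.]+)%\s+\d+\s+(.+?)\s*$"
)


def parse_cpu(text):
    """Return (summary dict or None, processes sorted by 5-second CPU, highest first)."""
    summary = None
    procs = []
    for line in text.splitlines():
        m = SUMMARY_RE.search(line)
        if m:
            total, intr, one_min, five_min = map(int, m.groups())
            summary = {
                "5sec_total": total,
                "5sec_interrupt": intr,
                "1min": one_min,
                "5min": five_min,
            }
            continue
        m = ROW_RE.match(line)
        if m:  # the header row does not start with a digit, so it is skipped
            procs.append({
                "pid": int(m.group(1)),
                "5sec": float(m.group(2)),
                "1min": float(m.group(3)),
                "5min": float(m.group(4)),
                "name": m.group(5),
            })
    procs.sort(key=lambda p: p["5sec"], reverse=True)
    return summary, procs


if __name__ == "__main__":
    summary, procs = parse_cpu(SAMPLE)
    print(summary)
    for p in procs:
        print(p["pid"], p["name"], p["5sec"])
```

On the sample above, the BGP Router process sorts first, which matches a router rebuilding its routing table after coming back up.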


Further electrical maintenance is planned in the building this week: Tuesday, Wednesday and Thursday.

Update(s):

Date: 2014-11-18 18:48:34 UTC
Traffic is back to normal.

Date: 2014-11-18 18:47:53 UTC
We found the cause: the card did not have 1GB of RAM for the DFC (only 256MB),
so there was not enough memory available to program the CEF entries.
We are putting traffic back on the interfaces.

Date: 2014-11-18 11:33:55 UTC
The card that was replaced yesterday is causing problems: traffic does not seem to be forwarded correctly.
I have shut the ports and we are investigating with Cisco.

Date: 2014-11-18 09:28:20 UTC
Maintenance is completed, but the links are still down at Level3 between var and ams.
A ticket has been opened.

Date: 2014-11-18 09:26:51 UTC
All the links to Frankfurt are currently up:

var-5-6k-SUP2T#sh int status | i fra
Te1/5 1|fra-1:t7/3(int:4 connected routed full 10G 10Gbase-LR
Te1/6 1|fra-5:t5/3(int:7 connected routed full 10G 10Gbase-LR
Te2/3 2|fra-1-6k(int:127 connected routed full 10G 10Gbase-LR
Te2/4 2|fra-5-6k(int:127 connected routed full 10G 10Gbase-LR
Te2/5 1|fra-1:t7/4(int:5 connected routed full 10G 10Gbase-LR
Te2/6 1|fra-5:t7/3(int:7 connected routed full 10G 10Gbase-LR
Te3/3 2|fra-1-6k(int:127 connected routed full 10G 10Gbase-LR
Te3/4 2|fra-5-6k(int:127 connected routed full 10G 10Gbase-LR
Po3 2|fra-1-6k(int) connected routed a-full 10G
Po4 2|fra-5-6k(int) connected routed a-full 10G
Po6 1|fra-1-6k(old_int connected routed a-full 10G
Po7 1|fra-5-6k(old_int connected routed a-full 10G

Only the link from var-1 to Amsterdam remains an issue.
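
A check like the one above can be scripted. The sketch below is only an illustration under stated assumptions: it expects `sh int status` style lines where the port is the first field, the description contains no spaces (as in the output pasted here), and the status is the third field. The function name `ports_not_connected` is hypothetical, and the `Te9/1` line in the usage example is invented purely to show a failing case.

```python
def ports_not_connected(output):
    """Return the ports whose status field is anything other than 'connected'.

    Assumes: port in field 1, a space-free description in field 2,
    status in field 3. Lines that do not start with a Te or Po port
    name (prompts, wrapped continuation lines) are ignored.
    """
    down = []
    for line in output.splitlines():
        tokens = line.split()
        if len(tokens) < 3 or not tokens[0].startswith(("Te", "Po")):
            continue
        if tokens[2] != "connected":
            down.append(tokens[0])
    return down


# Lines copied from the output above.
SAMPLE = """\
var-5-6k-SUP2T#sh int status | i fra
Te1/5 1|fra-1:t7/3(int:4 connected routed full 10G 10Gbase-LR
Te2/3 2|fra-1-6k(int:127 connected routed full 10G 10Gbase-LR
Po3 2|fra-1-6k(int) connected routed a-full 10G
Po6 1|fra-1-6k(old_int connected routed a-full 10G
"""

if __name__ == "__main__":
    print(ports_not_connected(SAMPLE))
    # Hypothetical extra line, not from the incident, to show a down port.
    extra = SAMPLE + "Te9/1 1|example(int:1 notconnect routed full 10G 10Gbase-LR\n"
    print(ports_not_connected(extra))
```

An empty result for the Frankfurt ports is consistent with the statement that only the var-1 to Amsterdam link is still affected.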

Date: 2014-11-18 09:24:36 UTC
Some of the links to Frankfurt are back up.

Date: 2014-11-18 09:23:20 UTC
The two routers are back online.
var-1 was reloaded.
var-5 was not reloaded.

There must be a problem with our providers:
the link between var-5 and Frankfurt via Interoute is down, and our link between ams-1 and var-1 via Level3 is also down.

Date: 2014-11-18 09:17:34 UTC
New electrical maintenance in the building: var-1 and var-5 are unreachable.
CDN Warsaw is impacted.

It is going to be a long night in Warsaw.

Date: 2014-11-18 09:16:12 UTC
The card has been replaced.

However, we have a problem on one of the links between var-1-6k and var-5-6k.
There are no CRC errors, yet traffic does not flow normally when this port is in the bundle.

We have shut the port and will troubleshoot with the data center tomorrow morning.

Date: 2014-11-17 14:24:13 UTC
Module 3 is out of service.

var-1-6k#sh mod | i Pwr
3 001d.70aa.3874 to 001d.70aa.3877 2.7 12.2(14r)S5 12.2(33)SXI1 PwrDown
3 Distributed Forwarding Card WS-F6700-DFC3BXL SAL1426LAE1 5.6 PwrDown

Nov 17 11:45:06.019: %SYS-DFC3-5-RESTART: System restarted --
Nov 17 11:45:06.023: DFC3: Currently running ROMMON from S (Gold) region
Nov 17 12:45:13 GMT: %EARL_L3_ASIC-DFC3-3-INTR_WARN: EARL L3 ASIC: Non-fatal interrupt Adj. table interface interrupt

Nov 17 12:45:28 GMT: %PM_SCP-SP-1-LCP_FW_ERR: System resetting module 3 to recover from error: Linecard received system exception. Errcode =
Nov 17 12:45:28 GMT: %OIR-SP-3-PWRCYCLE: Card in module 3, is being power-cycled 'Off (Module Reset due to exception or user request)'
Nov 17 12:45:28 GMT: %C6KPWR-SP-4-DISABLED: power to module in slot 3 set Off (Module Reset due to exception or user request)

An RMA is underway.
Posted Nov 17, 2014 - 01:37 UTC