Date: 2010-07-25 22:59:00 UTC This will be fixed by the BGP collector router, which has been ordered and should arrive in 5 weeks. We will then have fewer BGP sessions per router, and only simple BGP.
Date: 2010-07-24 21:52:44 UTC We have restored all sessions on fra-5. It is stable.
We believe this is a memory and memory fragmentation problem, dating from when we secured the backbone via "london/amsterdam" and "paris/frankfurt".
The ldn, ams and fra routers have consumed more memory because of the new information, and we are visibly reaching the upper limits. On ldn, for example, 73 MB of 1 GB remains free, but only 53 MB of that is unfragmented.
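To put the ldn figures in proportion, here is a minimal Python sketch of the arithmetic. It assumes 1 GB means 1024 MB and reads "non fragmented" as the part of the free memory that is still usable as contiguous blocks. The function name `memory_headroom` is hypothetical and only serves this illustration.

```python
MB_PER_GB = 1024  # assumption: 1 GB is counted as 1024 MB


def memory_headroom(total_mb, free_mb, unfragmented_mb):
    """Express free and unfragmented memory as percentages of the total."""
    return {
        "free_pct": 100 * free_mb / total_mb,
        "unfragmented_pct": 100 * unfragmented_mb / total_mb,
        # Free memory that is scattered in fragments.
        "fragmented_mb": free_mb - unfragmented_mb,
    }


# Figures quoted for ldn: 73 MB free out of 1 GB, 53 MB unfragmented.
ldn = memory_headroom(1 * MB_PER_GB, 73, 53)
for name, value in ldn.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions, roughly 7% of the router's memory is free and about 5% is unfragmented, which matches the description of the routers running close to their limits.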
Date: 2010-07-24 21:04:47 UTC On ldn-1-6k, from the crashinfo:
Jul 24 19:05:24 GMT: %C6K_PLATFORM-SP-2-PEER_RESET: SP is being reset by the RP
Date: 2010-07-24 20:28:12 UTC We isolated all the sessions on fra-5 and disconnected them all.
We are saving the configuration, then rebooting.
Date: 2010-07-24 20:26:04 UTC fra-5 is down again. It's a memory problem. We are doing a hard reboot.
Date: 2010-07-24 19:48:21 UTC We are booting card by card
fra-5-6k(config)#no power en module 2
fra-5-6k(config)#no power en module 7
fra-5-6k(config)#no power en module 8
fra-5-6k(config)#no power en module 9
Date: 2010-07-24 19:47:50 UTC Jul 24 21:32:47 40g.fra-5-6k.routers.chtix.eu 418: Jul 24 20:32:27 GMT: %C6KPWR-SP-4-DISABLED: power to module in slot 8 set off (Module Failed SCP dnld)
Date: 2010-07-24 19:47:39 UTC fra-5: there are still some problems:
Jul 24 20:30:53 GMT: %TFIB-SP-7-SCANSABORTED: TFIB scan not completing. MAC string updated.
-Traceback= 40E40578 40E40904 40F1664C 40E18AD8 40E19078 40DFF760 40DFFB7C 40DFFE58 40E00AD8
Jul 24 20:31:11 GMT: %TFIB-DFC4-7-SCANSABORTED: TFIB scan not completing. MAC string updated.
-Traceback= 20F6AE38 20F6B1C4 2103E87C 20F43398 20F43938 20F2A020 20F2A43C 20F2A718 20F2B398
Jul 24 20:31:14 GMT: %TFIB-DFC1-7-SCANSABORTED: TFIB scan not completing. MAC string updated.
-Traceback= 20F6AE38 20F6B1C4 2103E87C 20F43398 20F43938 20F2A020 20F2A43C 20F2A718 20F2B398
Jul 24 20:31:15 GMT: %TFIB-DFC5-7-SCANSABORTED: TFIB scan not completing. MAC string updated.
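Messages like the ones above follow the Cisco IOS layout `%FACILITY[-SUBFACILITY]-SEVERITY-MNEMONIC: text`, where a lower severity number means a more serious event. The following is a minimal Python sketch for tallying such messages by type. The names `IOS_MSG` and `parse_ios_message` are hypothetical, and the sample lines are copied from this incident log.

```python
import re
from collections import Counter

# %FACILITY[-SUBFACILITY]-SEVERITY-MNEMONIC: text
IOS_MSG = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)"
    r"(?:-(?P<sub>[A-Z0-9_]+))?"
    r"-(?P<severity>[0-7])"
    r"-(?P<mnemonic>[A-Z0-9_]+):\s*(?P<text>.*)"
)


def parse_ios_message(line):
    """Return the message fields as a dict, or None if the line has no IOS message."""
    match = IOS_MSG.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["severity"] = int(fields["severity"])
    return fields


sample = [
    "Jul 24 20:30:53 GMT: %TFIB-SP-7-SCANSABORTED: TFIB scan not completing. MAC string updated.",
    "-Traceback= 40E40578 40E40904 40F1664C 40E18AD8 40E19078",
    "Jul 24 20:31:11 GMT: %TFIB-DFC4-7-SCANSABORTED: TFIB scan not completing. MAC string updated.",
    "Jul 24 19:05:24 GMT: %C6K_PLATFORM-SP-2-PEER_RESET: SP is being reset by the RP",
    "Jul 24 19:27:53 GMT: %FIB-2-FIBDOWN: CEF has been disabled due to a low memory condition.",
]

# Traceback lines do not match and are skipped.
parsed = [p for p in map(parse_ios_message, sample) if p is not None]
counts = Counter((p["facility"], p["mnemonic"]) for p in parsed)
most_severe = min(parsed, key=lambda p: p["severity"])

for (facility, mnemonic), n in counts.items():
    print(f"{facility} {mnemonic}: {n}")
```

Sorting or filtering on `severity` separates the debug-level TFIB scan messages (severity 7) from the critical ones such as the CEF shutdown (severity 2).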
Date: 2010-07-24 19:44:46 UTC We have reverted a queue modification on the 10G ports to restore the old values. We had made the change this week to increase the buffers on the ports.
Apparently the router did not handle this option correctly.
Date: 2010-07-24 19:39:59 UTC fra-5-6k is back up. The cards are not yet all back properly.
ams-1-6k is back up as well; it has rebooted a card again.
ldn-1-6k has crashed; we are fixing it over the serial cable, boot in progress.
vss-2-6k: proxy ARP has been re-enabled.
This is the worst backbone crash we have ever had at OVH.
It was a domino effect across routers that had not been rebooted for a long time and whose RAM was fragmented.
It is time to deploy a new generation of routers.
This was planned, but only for September (it has to become available).
Date: 2010-07-24 19:01:56 UTC Proxy ARP disabled on vss-2.
Date: 2010-07-24 18:57:53 UTC ams-1 went down. The router has just come back.
Date: 2010-07-24 18:54:46 UTC We isolated fra-5.
Date: 2010-07-24 18:52:16 UTC Jul 24 20:28:13 40g.fra-5-6k.routers.chtix.eu 623150: Jul 24 19:27:53 GMT: %FIB-2-FIBDOWN: CEF has been disabled due to a low memory condition.
Jul 24 20:28:13 40g.fra-5-6k.routers.chtix.eu 623151: It can be re-enabled by configuring "ip cef [distributed]"
Date: 2010-07-24 18:29:54 UTC fra-5 and th1-1 are failing. Not enough CPU.
We have disabled MPLS across the whole backbone.