We have had an issue with a UPS in Strasbourg. The power supply to some racks was cut for a few minutes.
Update(s):
Date: 2013-11-19 23:28:46 UTC After in-depth investigation, it turned out that two failures led to this situation:
- During the last intervention on UPS2, the UPS manufacturer made an error or oversight on a bit in the UPS settings. This bit appears to be the cause of the UPS output cutoff. We have no explanation other than human error by the manufacturer.
- While setting up the temporary generator, the electrical installer didn't check the voltage level delivered by the temporary generator. It was noticeably out of range for 2 minutes, which would not have caused any issue had the UPS been properly configured.
The UPS setting issue is fixed.
The generator's voltage level will be checked tomorrow (putting the temporary generator into production has been postponed until tomorrow).
Date: 2013-11-19 17:58:40 UTC The manufacturer has been on site for 2 hours and is continuing to extract the logs and work with our teams to understand what happened.
It seems that the outage occurred the second we switched over to the temporary generator.
Date: 2013-11-19 17:54:50 UTC 42 more servers are down, a good part of which are running a filesystem check.
Date: 2013-11-19 11:08:19 UTC During deployment of a new generator in Strasbourg (http://status.ovh.co.uk/?do=details&id=5796), we conducted a temporary generator test. UPS 2 shut down during this test.
We don't yet have an explanation for this shutdown; we are making enquiries with the manufacturer.