The threat of a network outage is usually met with the response: “It’ll never happen to me.” While this reaction might feel comforting, it’s certainly not true. Even some of the largest technology companies in the world, including Microsoft and eBay, suffer the occasional outage. With Aberdeen Group estimating the average cost of network downtime at $161,000 per hour, it pays to avoid a number of common slip-ups during an outage to minimize the overall cost. Below we’ve assembled some of the top mistakes made during network downtime and the precautions that you and your team can take to better prepare for a network outage.
No Established Outage Response Plan
With so much at stake during a network outage, it’s easy to get caught up in a whirlwind of panic. It’s absolutely critical for IT teams to have a response plan that’s been proven to work and practiced ahead of time. Your response plan should clearly define team responsibilities, including communication roles and who’s part of the escalation path. The plan should also include a post-mortem process for identifying the root cause(s) behind the outage and working to address any contributing gaps.
If you’ve ever played on a sports team, you’ll know that one of the cardinal rules for success is to prioritize communication. While it might seem like a good idea to turn away from your customers while you sort out the issue, this response usually brings a number of negative consequences. Aside from issues like longer hold times and poor customer service, it may create the impression that your company is unreliable. Your company should implement systems to notify customers of any updates related to the outage. In addition, ensure that you’ve established responsibilities for both internal and external communication.
Messy Disaster Recovery
While it’s always best practice to have a disaster recovery solution in place, companies often make the unfortunate mistake of architecting a DR solution without considering the increased load on the secondary system during recovery. To avoid this situation, you should run capacity tests on both of your systems so you know how performance scales with workload.
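As a rough illustration of what those capacity tests feed into, the sketch below checks whether a secondary system could carry its own peak load plus the primary’s during recovery. Everything here is an assumption made for illustration: the `SystemCapacity` fields, the use of requests per second as the workload unit, and the 20% safety margin are hypothetical, and the numbers would come from your own test results.

```python
from dataclasses import dataclass


@dataclass
class SystemCapacity:
    """Capacity-test results for one system (hypothetical structure)."""
    name: str
    max_sustained_rps: float  # highest request rate the tests sustained acceptably
    peak_load_rps: float      # peak request rate seen in normal operation


def failover_headroom(primary: SystemCapacity,
                      secondary: SystemCapacity,
                      safety_margin: float = 0.2) -> float:
    """Fraction of the secondary's capacity left over after it absorbs
    the primary's peak load on top of its own, padded by a safety margin.
    A negative value means the secondary would be overloaded."""
    required = (primary.peak_load_rps + secondary.peak_load_rps) * (1 + safety_margin)
    return (secondary.max_sustained_rps - required) / secondary.max_sustained_rps


def can_absorb_failover(primary: SystemCapacity,
                        secondary: SystemCapacity,
                        safety_margin: float = 0.2) -> bool:
    """True if the secondary has non-negative headroom during recovery."""
    return failover_headroom(primary, secondary, safety_margin) >= 0


if __name__ == "__main__":
    primary = SystemCapacity("primary-dc", max_sustained_rps=1500, peak_load_rps=800)
    secondary = SystemCapacity("dr-site", max_sustained_rps=1000, peak_load_rps=100)
    print(can_absorb_failover(primary, secondary))
```

A single pass/fail number hides how performance degrades as load rises, so a check like this is best treated as a starting point alongside the full capacity-test curves.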
Responsibilities Spread Too Thin
Are you familiar with the Bystander Effect? It’s a social psychological phenomenon in which individuals are less likely to take action in an emergency when others are present. It’s not that these are necessarily bad or lazy people; they simply assume that someone else will take action, or that the issue might not be that serious. The same dynamic can play out during a network outage: people are less likely to contribute to resolving the issue when they are part of a large group. Companies frequently have issues with this kind of responsibility diffusion. When individuals in a group don’t have a clear sense of who’s responsible for what, chaos ensues. A single monitoring platform can help by notifying each individual with information relevant to their responsibilities.
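One way to picture ownership-based notification is a simple lookup that sends each alert to exactly one named owner, with a fallback so nothing goes unclaimed. This is a minimal sketch under stated assumptions: the component names, the people, and the `incident-commander` fallback role are all hypothetical and do not describe any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """A monitoring alert tied to one component (hypothetical structure)."""
    component: str
    message: str


# Hypothetical ownership map: every component has exactly one named owner.
OWNERS = {
    "core-router": "alice",
    "dns": "bob",
    "firewall": "carol",
}

# Hypothetical fallback role so that unmapped components still reach someone.
FALLBACK_OWNER = "incident-commander"


def route_alert(alert: Alert,
                owners: dict = OWNERS,
                fallback: str = FALLBACK_OWNER) -> str:
    """Return the single person responsible for acting on this alert."""
    return owners.get(alert.component, fallback)


if __name__ == "__main__":
    alert = Alert(component="dns", message="resolver timeouts above threshold")
    print(f"Notify {route_alert(alert)}: {alert.message}")
```

Routing each alert to one person rather than a whole group is the point of the design: a named owner cannot assume someone else will pick it up.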