
Too many organizations treat network management as a "nice to have" part of their operational toolkit, rather than a "must-have" capability. You can usually get away with this for a while, but eventually your luck runs out...
Last week, I related an all-too-typical tale of woe about how a startup suffered an all-day customer-visible outage because of a network problem, explaining how network automation could have shortened the outage from hours to minutes. Well, it turns out that lack of network automation wasn't their only problem...
As it happened, they had no network management capability at all at the time of the outage. Their sole network management host had suffered a disk failure several days earlier, and they hadn't gotten around to restoring it because it was "just the network management system".
Unfortunately for them and their customers, the failed system that was "just the network management system" would have:
In retrospect, I'm sure they wish that they had engineered "just the network management system" with the same level of service reliability as their customer-visible "production" systems. I'm sure they wish that they had treated the failure of "just the network management system" with the same sort of urgency as they would a failure of one of their customer-visible "production" systems.
Once the network management system failed, they were living on borrowed time. When something else failed (in this case, the Ethernet switch), they were severely hampered in their ability to detect and deal with that failure, which resulted in an extended customer-visible outage. Even though the network management system isn't itself customer-visible, it is an essential part of providing a reliable service, and needs to be treated as such.
Netomata can help you avoid problems like this with your network, while making your network more cost-effective, reliable, and flexible; please contact us to discuss how.