How to build a centralized overview for monitoring enterprise IT
Originally published on October 16, 2020 by Shaun Behrens
Last updated on October 16, 2020 • 7 minute read
One of the biggest challenges with monitoring enterprise IT (which we define as environments with over 1,000 devices) is getting a unified overview. In such large environments, you almost certainly have several monitoring servers collecting data from different parts of your infrastructure. This leads to all kinds of problems, such as alert noise or multiple monitoring tools, but the biggest issue is that you lose sight of the big picture: you can no longer gauge the health of your entire infrastructure at a glance. So how do you bring data from a multitude of devices and sources in different locations into one centralized overview?
The answer is simple: you want a dashboard with an overview of the infrastructure, so that you can tell immediately if there are potential or current issues, and what the causes of those issues might be. But how you achieve this is not so simple. Below I'll discuss the important aspects to consider when creating a centralized monitoring overview, but this is only one part of the process; for more information, download our guide to monitoring enterprise IT.
Organizing your infrastructure
It all starts with segmenting your infrastructure. You can learn more about how to do this in the "How to monitor large IT environments" guide mentioned above.
By way of example, you can segment by region, or you can segment by functional area.
You might manage everything from one location, in which case one central dashboard providing an overall summary would make sense. Or, you might have sites administered separately, each with their own dashboards. But even in this case, you would still want to have a high-level dashboard giving you a single overview of all the locations.
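To make the idea of per-site dashboards feeding one high-level view more concrete, here is a minimal sketch in Python. It assumes each site reports a simple state per device; the state names ("ok", "warning", "down"), the function names, and the sample sites are all hypothetical and not tied to any particular monitoring tool.

```python
from collections import Counter

# Hypothetical severity ranking: a higher number means a worse state.
SEVERITY = {"ok": 0, "warning": 1, "down": 2}


def site_summary(device_states: dict[str, str]) -> dict:
    """Condense one site's device states into its worst state plus counts."""
    counts = Counter(device_states.values())
    # An empty site is reported as "ok" rather than raising an error.
    worst = max(device_states.values(), key=SEVERITY.__getitem__, default="ok")
    return {"worst": worst, "counts": dict(counts)}


def global_overview(sites: dict[str, dict[str, str]]) -> dict[str, dict]:
    """Build the high-level view: one summary entry per site."""
    return {site: site_summary(devices) for site, devices in sites.items()}


# Illustrative data only: two sites, segmented by region.
sites = {
    "emea": {"fw-1": "ok", "sw-1": "warning"},
    "apac": {"fw-2": "ok", "sw-2": "ok"},
}
overview = global_overview(sites)
```

Each site can still keep its own detailed dashboard; the top-level view only needs the condensed summary per site.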
Get an overview of your business services
A centralized view should be very high level. But what does this mean? Again, it depends on how you segment your network, but a good way to do this is to relate components of your infrastructure to business services. For example: your company’s E-mail service, the licensing system, or software build processes are all IT services provided by several connected bits of hardware and connectivity.
Once you’ve defined your business services, you can map the relevant parts of the infrastructure to them. Let's take the E-mail service example: the mail server, storage servers and the internet connection are the components of your network and infrastructure that you map to the “E-mail service” business service. On your centralized dashboard, you would only see the health of the E-mail service.
If a minor issue occurs (say, a redundant mail server has performance problems), the E-mail service itself would not be endangered, since there are failover mail servers. A notification would be sent to a team member, but there wouldn't be an alert for the entire team, and on the centralized dashboard the service would stay green.
However, if there is a service-critical problem — maybe a crash of the core switch through which all mail data passes — then the dashboard sends an alert to the whole team and the E-mail service turns red on the centralized dashboard. At this point, you can drill down to the underlying components to see what part of the infrastructure is causing the problem.
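The two scenarios above can be sketched as a small roll-up rule. This is only an illustration under stated assumptions: components are grouped into hypothetical "redundancy groups", a group counts as available while at least one member is healthy, and the service turns red only when a whole group is unavailable. The class names, the evaluate function, and the message wording are invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class RedundancyGroup:
    """Components that can stand in for one another (hypothetical model)."""
    name: str
    members: dict[str, bool]  # component name -> is it healthy?

    def is_available(self) -> bool:
        # The group keeps working while at least one member is healthy.
        return any(self.members.values())

    def failed_members(self) -> list[str]:
        return [name for name, healthy in self.members.items() if not healthy]


@dataclass
class BusinessService:
    name: str
    groups: list[RedundancyGroup] = field(default_factory=list)


def evaluate(service: BusinessService) -> tuple[str, list[str]]:
    """Return the dashboard color and the messages that would be sent."""
    color = "green"
    messages = []
    for group in service.groups:
        failed = group.failed_members()
        if not group.is_available():
            # Service-critical: nothing left in this group to fail over to.
            color = "red"
            messages.append(f"ALERT whole team: {group.name} is down")
        elif failed:
            # Minor issue: redundancy still covers the failed members.
            messages.append(
                f"notify one team member: {', '.join(failed)} failed in {group.name}"
            )
    return color, messages


# Illustrative mapping for the E-mail service example.
email = BusinessService(
    "E-mail service",
    [
        RedundancyGroup("mail servers", {"mail-1": False, "mail-2": True}),
        RedundancyGroup("core switch", {"core-switch": True}),
    ],
)
color, messages = evaluate(email)
```

With one of two mail servers failing, the service stays green and only a single notification is produced. Marking the core switch as unhealthy turns the service red, because that group has no remaining healthy member.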
SLA Monitoring and reporting
Arranging your infrastructure as business services not only makes it easy to get an overview, but also makes it easier to manage service level agreements (SLAs).
Large enterprises often have many SLAs in place. There are internal SLAs to ensure that the IT teams are meeting certain requirements. Then there are external, or customer-facing, SLAs for organizations that provide services to external stakeholders. For example: you might have an uptime agreement for a certain service; in this case, you need to constantly check connectivity to that service and raise an alert when it is not available. Or if you have an agreement that a certain amount of bandwidth will be available, you need to constantly measure available bandwidth and raise an alert if it becomes too low.
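As a rough sketch of the two SLA checks described above, the following Python computes availability from a series of connectivity probes and compares measured bandwidth against a guaranteed minimum. The targets, the 10% warning margin, and all names are illustrative assumptions, not values from any real agreement or tool.

```python
from dataclasses import dataclass


@dataclass
class UptimeSla:
    service: str
    target_percent: float  # illustrative target, e.g. 99.9


def availability_percent(probe_results: list[bool]) -> float:
    """Share of successful connectivity probes, as a percentage."""
    if not probe_results:
        raise ValueError("no probe results to evaluate")
    return 100.0 * sum(probe_results) / len(probe_results)


def uptime_breached(sla: UptimeSla, probe_results: list[bool]) -> bool:
    """True when measured availability falls below the agreed target."""
    return availability_percent(probe_results) < sla.target_percent


def bandwidth_state(available_mbps: float, guaranteed_mbps: float,
                    warn_margin: float = 0.1) -> str:
    """Classify measured bandwidth against the guaranteed minimum.

    warn_margin is a hypothetical early-warning buffer above the guarantee.
    """
    if available_mbps < guaranteed_mbps:
        return "critical"
    if available_mbps < guaranteed_mbps * (1 + warn_margin):
        return "warning"
    return "ok"


# Illustrative usage: 9 of 10 probes succeeded against a 99.9% target.
sla = UptimeSla("E-mail service", target_percent=99.9)
breached = uptime_breached(sla, [True] * 9 + [False])
```

The warning band is there so that an alert is raised while there is still headroom, which matches the goal of acting before an SLA is actually violated.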
Structuring your business services according to the SLAs you need to track will give you a better overview of the status of the services you are providing and – if there is an issue – let you drill down to discover the root cause of the problem to solve it before SLAs are violated.
Monitoring enterprise IT
A centralized overview is only one aspect of monitoring a large, distributed infrastructure. To read about the many challenges you face, how to set up your monitoring concept, and how to choose the right monitoring tool, take a look at our guide to monitoring enterprise IT.