The Future of Monitoring (1/2): We Won't Care About Infrastructure Anymore

 Originally published on April 19, 2019 by Greg Campion
Last updated on January 23, 2024 • 11 minute read

Anyone involved with IT knows that new technology is changing the landscape at an incredible rate. It feels like every day there is something new on the horizon that will ‘change the world’, or at least the way we think about the industry. Usually this doesn’t come to pass because that new technology never becomes truly pervasive, but in the last few years containers, the cloud, and serverless have all effected such a change.

My name is Greg, a Cloud Engineer at Paessler, and this is the first of two posts about the future of monitoring. However, before we even consider monitoring, we first need to understand the future of IT Infrastructure, and that's what I'll cover in this post.


IT Paradigm Shifts

Since I started working in IT as a systems administrator, there have been three major paradigm shifts when it comes to operations. The first was virtualization, where we went from a single bare metal server running a few applications to a single physical server running many virtualized “servers”. The hypervisor (the virtualization software) abstracts the underlying hardware, allowing admins to run many virtual servers on one bare metal machine. Connect several of these bare metal servers together into a cluster, and the hypervisor can load balance your virtual servers across all of them. This way, you get a more balanced workload across many servers with less initial investment.

The second shift has been the advent and overwhelming adoption of containers. Containers operate similarly to virtualization, but they take the abstraction to the next level. Instead of virtualizing the hardware and running a full-blown operating system on each VM (which has always been a pain to keep updated and running), containers run on top of the operating system of a host or node. This means you have many workloads running on top of a single operating system.

These nodes or hosts don’t have to be bare metal; they can also be VMs. The idea is that you have one “server” able to run many containers. Balancing your workload across those servers becomes more efficient because, instead of moving the entire OS and application, you are just moving or creating new instances of the application, which have a much smaller footprint.
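To make that concrete, here is a minimal sketch using the Docker SDK for Python that starts several instances of the same application on one host. The image and container names are placeholders; the point is that each instance shares the host's kernel instead of booting its own operating system.

```python
# Minimal sketch: run several instances of one application on a single host.
# "my-app:latest" is a placeholder image name for illustration.
import docker

client = docker.from_env()  # connects to the Docker daemon on this host/node

# Each container shares the host OS kernel, so starting another instance
# means starting another process tree, not another operating system.
for i in range(3):
    client.containers.run("my-app:latest", name=f"my-app-{i}", detach=True)

for container in client.containers.list():
    print(container.name, container.status)
```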

The third and most recent shift is towards serverless. This has happened because containers allow for one more level of abstraction: Functions as a Service, or FaaS, sometimes also called serverless because it eliminates the need for someone within your organization to maintain a server. This doesn’t mean that there isn’t a server somewhere running your function; it’s just that someone else is making sure that it runs.

FaaS allows software developers to write only their business logic and then upload it to a FaaS service, either with a public cloud provider like AWS or Azure, or something like Kubeless or OpenFaaS. They can then set up an event-driven architecture to run said business logic, and that’s it: you’re done! The operation of the servers for the containers and the operation of the container’s orchestration is completely abstracted away, leaving you to focus on your application’s development instead of worrying about how you are going to run it.
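As an illustration, here is a hypothetical Python function in the style of an AWS Lambda handler. The event shape (an order from a web shop) is invented for the example; what matters is that the code contains only business logic, with no server, OS, or container to manage.

```python
import json

def handler(event, context):
    """Business logic only: the platform decides where and how this runs."""
    order = json.loads(event.get("body", "{}"))
    total = sum(item["price"] * item["quantity"] for item in order.get("items", []))

    return {
        "statusCode": 200,
        "body": json.dumps({"orderId": order.get("id"), "total": total}),
    }
```

Wire that function to an event source such as an HTTP endpoint, a queue, or an object upload, and the provider takes care of running it on demand.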

There is a rather heated debate about what is better, serverless or containers, but luckily, that’s not what my boss asked me to write about. What I’ll be talking about here is the resulting paradigm shift that is occurring within the world of monitoring (seeing as that’s kind of what we do here).

When We Don't Care About Infrastructure

Due to the abstraction away from hardware and the ephemeral nature of modern applications, within the next few years we won’t care about infrastructure anymore. Now that’s obviously a bit inflammatory and divisive (as all clickbait quotes have to be nowadays), but when you think about it, it really does make sense. The further we remove ourselves and our applications from the bare metal, the less we should have to care about it.

If you are running a totally serverless application on some public cloud, not only do you not care about the infrastructure behind it, you couldn’t monitor it even if you wanted to. There’s no way to access the metrics from the network or servers or containers that are running your code. In this case, what you want to monitor is the performance of your code itself.
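A sketch of what that can look like, assuming a Python function running on AWS: since the servers and containers are out of reach, the function times its own work and publishes that as a custom metric. The namespace, metric name, and do_business_logic helper are made up for illustration; the CloudWatch call via boto3 is just one way to push such a metric.

```python
import time
import boto3

cloudwatch = boto3.client("cloudwatch")

def do_business_logic(event):
    # Placeholder for the application's real work.
    return {"ok": True}

def handler(event, context):
    start = time.perf_counter()
    result = do_business_logic(event)
    elapsed_ms = (time.perf_counter() - start) * 1000

    # We can't see the host's CPU or network, but we can report how the
    # code itself behaved.
    cloudwatch.put_metric_data(
        Namespace="MyApp",
        MetricData=[{
            "MetricName": "HandlerDuration",
            "Value": elapsed_ms,
            "Unit": "Milliseconds",
        }],
    )
    return result
```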

In the case of containers, if you are a DevOps team running your application in containers across a well-built Kubernetes cluster or a managed cluster running in the cloud, you shouldn’t have to think about the hardware running underneath it either. More and more, the management of Kubernetes (K8s) clusters or similar is ‘outsourced’ to the cloud or to another team, and neither the hardware underneath these managed clusters nor the clusters themselves are of any real concern to the department running the application.

The reason that outsourcing makes sense is that with the abstraction of computing, hardware and the operation and maintenance of hardware become more of a commodity. A generic set of hosts running generic container orchestration software can run just about any type of application workload. And the more of this you do, the cheaper it gets. The reason running things in the cloud has become so cheap, and therefore so pervasive, is that cloud providers can run the hypervisor or container software at scale for millions of users much more efficiently than a single organization can.

How Do We Monitor This New Architecture?

The question then arises: ‘How should we monitor?’ This can be a complicated question and depends on what your business does. But when it comes to applications and workloads that are running on modern infrastructure, we now have to look past what’s running that workload and focus on instrumenting the applications being run.

Lately a term has been popping up in an attempt to encompass this idea: Observability.

Like DevOps, this term’s definition is hotly debated, but the overarching idea is that alongside what we think of as traditional monitoring, observability also includes metrics, logs and traces (the three pillars of observability). These are pulled or pushed directly from our workload or application so that we can analyze and troubleshoot it on the fly. With this data, we can infer the current state of a system from its external outputs and have the context we need to understand it.
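For the metrics pillar, instrumenting the application directly might look like this minimal Python sketch using the Prometheus client library. The metric names, labels, and endpoint are illustrative; the point is that the numbers come from inside the code, not from the box it happens to run on.

```python
import time
from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("app_requests_total", "Requests handled", ["endpoint", "status"])
LATENCY = Histogram("app_request_seconds", "Request duration in seconds", ["endpoint"])

def handle_request(endpoint):
    start = time.perf_counter()
    status = "200"
    try:
        pass  # the application's real work goes here
    finally:
        REQUESTS.labels(endpoint=endpoint, status=status).inc()
        LATENCY.labels(endpoint=endpoint).observe(time.perf_counter() - start)

if __name__ == "__main__":
    start_http_server(8000)  # exposes the metrics at :8000/metrics for scraping
    while True:
        handle_request("/checkout")
        time.sleep(1)
```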

High cardinality in our monitoring data used to be an anti-pattern and something that everyone tried to avoid. However, to make an application observable, many argue that storing high-cardinality data is a must in order to delve into problems when they occur. This allows the person running the system to ask specific questions of the data they are collecting in order to find a solution.
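One common way to capture high-cardinality data is to emit one wide, structured event per request, with fields like user and request IDs attached. The field names below are invented for the example; any log or event store that can index them would do.

```python
import json
import time
import uuid

def log_request(user_id, endpoint, status, duration_ms, region):
    event = {
        "timestamp": time.time(),
        "request_id": str(uuid.uuid4()),  # unique per request: maximal cardinality
        "user_id": user_id,               # high cardinality: one value per user
        "endpoint": endpoint,
        "status": status,
        "duration_ms": duration_ms,
        "region": region,
    }
    print(json.dumps(event))  # ship this to your log or event store of choice

log_request("user-84231", "/checkout", 500, 1240.5, "eu-central-1")
```

With events like these stored, you can later ask very specific questions ("show me slow checkouts for this one user in this region") instead of being limited to pre-aggregated averages.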

What Exactly is Observability?

Now that we understand the underlying trends and concepts that are shaping the IT infrastructure of the future, and we have a basic definition of observability, we can start to delve into the details of observability and what it means. I will cover that in a future post. While you wait for that one, let's discuss my thoughts on the future of IT in the comments below. Are my predictions realistic? Or am I way off course? Start the discussion!