Being On-Call with PRTG and iLert
Apr 15, 2019 • 7 minute read
Every organization that relies on its services being always available needs to respond to incidents fast. How fast depends on the availability target. A target of 99.95 %, for instance, allows 21.6 minutes of downtime in a 30-day month. That budget has to cover detection, response, and repair, so even with only one incident per month, the on-call person needs to react within minutes.
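The downtime budget follows directly from the availability target. Here is a minimal Python sketch of the arithmetic, assuming a 30-day month; the function name is only illustrative.

```python
def downtime_budget_minutes(availability_percent: float, days: int = 30) -> float:
    """Return the downtime (in minutes) that a target allows over `days` days."""
    total_minutes = days * 24 * 60  # 43,200 minutes in a 30-day month
    return total_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Compare a few common targets; 99.95 % is the one used in the text.
    for target in (99.0, 99.9, 99.95, 99.99):
        budget = downtime_budget_minutes(target)
        print(f"{target} % allows {budget:.1f} minutes of downtime per month")
```

Each additional "nine" cuts the budget by a factor of ten, which is why higher targets leave so little room for slow alerting.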
To meet availability targets, you need two things: reliable monitoring that detects issues and generates actionable alerts, and alert notification that makes sure each alert gets the attention of the right person who can fix the problem. When both work well together, monitoring and alerting tell us when a system is broken or about to break, without requiring someone to stare at dashboards and watch for problems to occur, as is done in traditional NOCs (network operations centers).
That being said, while monitoring and alert notification can be unified in the monitoring system, this strategy has its shortcomings when you strive for low response times and want to share on-call responsibility across your operations and engineering teams. Monitoring tools often lack capabilities in the alert notification and on-call management space, such as frictionless alert acknowledgement, alerting via voice call, on-call schedule management, and automatic escalations. Some organizations simply send emails to the entire team for urgent alerts, which often results in nobody taking responsibility and everyone ignoring the emails. Besides, email is probably the worst alerting mechanism and should never be used as the primary alert notification method.
Dedicated alerting systems extend monitoring tools with advanced alerting and on-call management capabilities. One such tool that natively integrates with PRTG is iLert, an alerting and on-call management platform for operations and engineering teams. iLert helps you respond to incidents quickly and reduce downtime by adding on-call schedules, SMS, push, and voice alerts to PRTG. This blog post shows how you can integrate PRTG with iLert in three easy steps:
#1 Create a PRTG Alert Source in iLert
Go to Alert sources and click Create alert source. Choose a name for the alert source, assign an escalation policy, and select PRTG Network Monitor as the Integration type. The escalation policy determines who will be alerted if an incident is created by PRTG. Escalation policies in iLert either list specific people to notify or reference an on-call schedule. For example, an escalation policy might be configured to notify the person who is on-call according to the “Primary Ops Team Schedule”, then escalate to the “Secondary Ops Team Schedule” if the primary on-call doesn’t acknowledge the alert within 10 minutes, and finally alert another backup person (such as the team lead) if neither the primary nor the secondary responds.
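To make the escalation logic concrete, the following Python sketch models the example policy as an ordered list of rules. It is an illustration of the concept, not iLert's implementation: the class, the function, and the 10-minute timeout for the secondary schedule are assumptions (the text above only specifies the timeout for the primary).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EscalationRule:
    target: str           # schedule or person to notify (illustrative names)
    timeout_minutes: int  # how long to wait for an acknowledgement before escalating


# Hypothetical policy mirroring the example in the text.
POLICY = [
    EscalationRule("Primary Ops Team Schedule", 10),
    EscalationRule("Secondary Ops Team Schedule", 10),  # assumed timeout
    EscalationRule("Team lead", 0),                     # last level, nothing to escalate to
]


def notified_targets(policy: List[EscalationRule], minutes_unacknowledged: int) -> List[str]:
    """Return everyone alerted so far for an incident nobody has acknowledged."""
    notified = []
    elapsed = 0  # minutes at which the current level gets alerted
    for rule in policy:
        if minutes_unacknowledged < elapsed:
            break
        notified.append(rule.target)
        elapsed += rule.timeout_minutes
    return notified


if __name__ == "__main__":
    for minutes in (0, 10, 25):
        print(minutes, notified_targets(POLICY, minutes))
```

Once someone acknowledges the incident, escalation stops, so later levels are only reached when earlier ones stay silent.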
After saving the alert source, you will see a URL and Payload data to be used later in PRTG.
#2 Create a Notification Template in PRTG
Go to Setup > Account Settings > Notification Templates
Click on Add Notification Template and enable Execute HTTP Action. Copy the URL and Payload from step #1 and paste them into the respective fields in the HTTP Action.
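Conceptually, the HTTP action fills in the placeholders of the payload and sends the result to the URL as an HTTP request. The Python sketch below imitates that behavior so you can see what ends up on the wire. Everything specific in it is an assumption: the URL and payload template are made-up stand-ins (use the values shown in your iLert alert source), and the form-style encoding of the values is a choice made for this example rather than documented PRTG behavior.

```python
import re
from urllib.parse import quote_plus
from urllib.request import Request

# Hypothetical stand-ins: copy the real URL and payload from the iLert alert source.
ILERT_URL = "https://example.invalid/prtg-alert-source"
PAYLOAD_TEMPLATE = "device=%device&sensor=%name&status=%status&message=%message"


def render_payload(template: str, values: dict) -> str:
    """Replace %placeholders with encoded values; unknown placeholders stay as-is."""
    def substitute(match):
        key = match.group(1)
        if key in values:
            return quote_plus(str(values[key]))
        return match.group(0)

    return re.sub(r"%([a-z]+)", substitute, template)


def build_request(url: str, payload: str) -> Request:
    """Build (but do not send) the POST request that carries the payload."""
    return Request(url, data=payload.encode("utf-8"), method="POST")


if __name__ == "__main__":
    payload = render_payload(
        PAYLOAD_TEMPLATE,
        {"device": "web-01", "name": "Free Disk Space", "status": "Down", "message": "5 % free"},
    )
    request = build_request(ILERT_URL, payload)
    print(request.get_method(), request.full_url)
    print(payload)
```

The sketch only builds the request. Sending it with `urllib.request.urlopen(request)` against your real alert source URL would be one way to test the endpoint without waiting for a sensor to fail.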
Now you can use the notification template in PRTG like any other notification template and assign it to devices or groups of devices.
#3 Test the integration
To test the integration, generate an alert in PRTG, for example by temporarily raising the warning threshold of a Free Disk Space sensor. Below is a sample incident in iLert that was generated by PRTG.
The incident contains a link pointing back to the sensor in PRTG that generated the alert. Moreover, the incident in iLert will be automatically resolved once the alert in PRTG is resolved.
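The create-and-resolve behavior can be pictured as a small state machine keyed by sensor. The Python sketch below is a hypothetical model of that idea, not iLert's actual logic: the function name, the status strings it reacts to, and the returned action labels are all assumptions made for illustration.

```python
from typing import Dict


def handle_notification(sensor_id: str, status: str, open_incidents: Dict[str, str]) -> str:
    """Update `open_incidents` in place and return the action that was taken."""
    normalized = status.strip().lower()

    if normalized == "up":
        # The sensor recovered: close the matching incident, if there is one.
        if sensor_id in open_incidents:
            del open_incidents[sensor_id]
            return "resolved"
        return "ignored"

    if normalized in ("down", "warning"):
        # The sensor is alerting: open an incident or refresh the existing one.
        action = "updated" if sensor_id in open_incidents else "created"
        open_incidents[sensor_id] = status
        return action

    return "ignored"  # statuses this sketch does not model


if __name__ == "__main__":
    incidents: Dict[str, str] = {}
    for status in ("Down", "Down", "Up"):
        print(status, handle_notification("sensor-2041", status, incidents))
```

Keying incidents by sensor is what lets a later recovery notification find and close the incident that the earlier failure opened.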
The integration is now fully set up. You have many options to further tweak the integration to your needs. For example, you can connect multiple alert sources and teams in iLert with PRTG, so that different teams get alerted depending on the device or sensor in PRTG that generated the alert. Feel free to drop us a message via email or chat if you have any questions.
About the Author
iLert adds on-call schedules, SMS, and voice alerts to your existing monitoring tools and supports two-way alerts on multiple channels. You can acknowledge or reject alerts on the go by replying to an SMS, by pressing a button in a phone call, or by using one of the iLert apps. A detailed PRTG Integration Guide is also available.