It comes unexpectedly, it is annoying, and almost every administrator has experienced it: a power outage. The causes are many and varied: an excavator cuts a critical electrical line, a distribution point is damaged, or a problem occurs in a power substation. In such cases, private households, data centers, and production facilities are all affected. In the production sector in particular, dependence on information and communication technology systems has increased in recent years due to Industry 4.0 and the Internet of Things (IoT).
Because of the ever-growing flood of data, data center IT services and the SLAs (service-level agreements) often associated with them depend heavily on a steady, uninterrupted power supply. It is therefore essential not only to connect all devices that are important for emergency operation to the UPS (uninterruptible power supply) systems correctly, but also to distribute them evenly across those systems.
Uptime during a power failure cannot be guaranteed if, for example, all components are connected but the storage system is attached to an underpowered UPS. The obvious consequence: all servers keep running, but at some point they can no longer access their data, because the capacity of that UPS is not sufficient to bridge the time until the standby power system has taken over or all servers have been shut down.
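The bridging problem described above can be made concrete with a rough runtime estimate. The following Python sketch is purely illustrative: the function names, the inverter efficiency of 0.9, and the example figures are assumptions, not values from a vendor datasheet. Real battery runtime also drops non-linearly under high load, so treat the result as an optimistic first approximation.

```python
def estimated_runtime_minutes(battery_wh, load_w, efficiency=0.9):
    """Rough linear estimate of UPS runtime in minutes (assumed model)."""
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    return battery_wh * efficiency / load_w * 60


def can_bridge(battery_wh, load_w, required_minutes, efficiency=0.9):
    """True if the estimated runtime covers the time needed until the
    standby power system takes over or all servers are shut down."""
    runtime = estimated_runtime_minutes(battery_wh, load_w, efficiency)
    return runtime >= required_minutes


# Hypothetical example: a 1000 Wh battery feeding a 1500 W storage system
print(f"{estimated_runtime_minutes(1000, 1500):.1f} min")  # 36.0 min
print(can_bridge(1000, 1500, required_minutes=45))         # False
```

Running the same check for every UPS and the devices attached to it shows quickly whether a single underpowered unit limits the whole emergency operation.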
The larger the server network, the more important it is to take a close look at the emergency power supply. Air conditioning systems, access systems and alarm systems should also not be neglected. When the temperature in the data center rises, this can significantly impair operation or lead to additional failures, despite sufficient power supply to the IT infrastructure.
PRTG offers various possibilities to monitor your UPS systems. In addition to manufacturer-specific sensors, you can integrate almost any UPS into the monitoring concept using SNMP. User-defined sensors additionally allow a deeper insight beyond the standard values. Depending on your UPS, PRTG can monitor the following values, among others:
PRTG also informs you reliably, through individual notifications, about voltage fluctuations and short power outages, such as those that occur at night. Especially in smaller field offices, where for example only a single server is operated with a UPS system, a power failure can easily go unnoticed. To prevent this, you can create a notification that is triggered when the value of the "Time on Battery" channel is greater than '0'. To do this, click on the "UPS Time on Battery" channel in the UPS Health Sensor and enter a value greater than '0' for "Upper Error Limit(s)":
In the next step you define the notification. To do this, click on "Notification Triggers" --> "Add State Trigger" and define the notification individually according to your needs:
Analogous to this example, you can also be notified of voltage fluctuations.
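The limit logic behind both notifications can be sketched in a few lines. This Python example is not PRTG code and does not use the PRTG API. The `Reading` type, the `check_limits` function, and the voltage band of 210 to 250 V are hypothetical; they only illustrate how a "greater than 0" condition on the time on battery and a voltage band translate into alert conditions.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One polled sample from a UPS (hypothetical structure)."""
    seconds_on_battery: int
    input_voltage: float


def check_limits(reading, min_voltage=210.0, max_voltage=250.0):
    """Return a list of alert messages for a single reading.

    The voltage band is an assumed example; choose limits that match
    your mains supply and the UPS specification.
    """
    alerts = []
    # Any time on battery above 0 means the mains supply has failed.
    if reading.seconds_on_battery > 0:
        alerts.append("UPS is running on battery")
    # A voltage outside the band indicates a fluctuation or outage.
    if not (min_voltage <= reading.input_voltage <= max_voltage):
        alerts.append(f"Input voltage out of range: {reading.input_voltage} V")
    return alerts


print(check_limits(Reading(seconds_on_battery=0, input_voltage=230.0)))  # []
print(check_limits(Reading(seconds_on_battery=42, input_voltage=0.0)))
```

The second call reports both conditions at once, which mirrors what you would see during an actual outage: the input voltage collapses and the battery timer starts counting.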
To be prepared for an emergency, regular load tests are vital. They are the only way to determine reliably whether the UPS systems and their configuration can continue to supply the connected systems with power for a defined period of time in the event of a power failure, because both function and performance are checked under real conditions. Regular tests also build routine for an actual emergency and reveal weak points. In addition, the systems should be maintained regularly by qualified personnel to avoid technical problems.
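How the result of such a load test might be evaluated can be sketched as follows. The function name, its parameters, and the 20 percent degradation threshold are assumptions chosen for illustration; the sketch simply compares the measured runtime with the required bridging time and with the result of the previous test.

```python
def evaluate_load_test(measured_minutes, required_minutes,
                       previous_minutes=None, max_degradation=0.2):
    """Return a list of findings for one UPS load test (assumed thresholds)."""
    findings = []
    # The UPS must hold the load at least as long as the bridging time.
    if measured_minutes < required_minutes:
        findings.append("runtime below required bridging time")
    # A clear drop compared with the last test hints at aging batteries.
    if (previous_minutes is not None
            and measured_minutes < previous_minutes * (1 - max_degradation)):
        findings.append("runtime dropped noticeably since the last test")
    return findings


# Hypothetical test: 18 minutes measured, 15 required, 25 in the last test
print(evaluate_load_test(18, 15, previous_minutes=25))
```

In this hypothetical case the UPS still meets the requirement, but the drop from 25 to 18 minutes is flagged, which is the kind of weak point a regular test is meant to reveal before it becomes critical.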
Do you perform regular tests under real conditions to be prepared for emergencies? We are happy to read your comments!