While IT and OT networks are similar, they are not the same. Using active monitoring tools as freely in OT networks as in IT networks can be a problem, or even an obstacle, because it can negatively impact the stability of Industrial Automation and Control Systems (IACS). Here I'll discuss how passive monitoring can be deployed in an industrial IT environment, and how it can be used in conjunction with active monitoring.
This is a guest article by Dr. Frank Stummer of Rhebo.
"A packet too far" is sometimes the cause of stability issues or even outages due to real-time requirements, or because "old" devices simply cannot handle standard pings. This is indeed something that occurs very often: In the Rhebo Stability and Security Audits last year, we found incorrectly configured asset and various network condition information measures in roughly 20% of the audited IACS networks. Most of them were unintentional (read: nobody knew about them). By the way, this unintentional network condition communication is only a subgroup of the 64% of audits where we found unnecessary services or "zombie" devices in OT networks. Sometimes, industrial networks are very much flooded with unnecessary and concerning communication.
With the rise of new business cases and production processes, OT networks are becoming bigger and increasingly complex. In fact, in many large companies across various sectors, the number of IP addresses (or connected components) in OT already exceeds the number in IT. It is therefore crucial to take the issues mentioned above into account.
On the other hand, dealing with OT networks is in some respects easier than dealing with IT networks. At the end of the day, machines talk to other machines in a systematic way, resulting in some action that is pre-programmed and thus predictable. This results in a very high level of deterministic and repetitive behavior and communication, as well as a much more stable infrastructure (in other words, there is no need to change components and network structures very often).
This high degree of determinism gives us a huge advantage. We can use specific approaches, measures, and tools to monitor OT networks and to achieve the otherwise often contradictory objectives of cybersecurity, stability, and availability at the same time.
Given these specific needs and characteristics of OT networks, a passive and non-reactive approach is the preferred option. This is the approach used by Rhebo Industrial Protector, an OT monitoring and anomaly detection system.
Let's look at a simple but real and very common example: a welding production cell that uses Profinet as its automation communication protocol and is passively monitored through a mirror port on a switch. Figure 1 below shows the notifications of all new "things" seen in the communication, clustered by type. Within just 30 seconds, the entire communication of one production cycle is monitored, learned, and visualized (the monitoring starts a few seconds after 10:00:00 in this example). After 30 seconds, the communication (along with its participants and content) simply repeats at the same level (Figure 2).
The main advantage of this? After learning the deterministic communication, the anomaly detection can start to passively analyze future communication patterns for any deviations that might cause a disruption or downtime.
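To make the learn-then-compare idea concrete, here is a minimal Python sketch. It is not how Rhebo Industrial Protector works internally; it only illustrates the principle under simplifying assumptions. The `Flow` tuple, the function names, and the example MAC addresses are all hypothetical, and a real system would look at far more than source, destination, and protocol.

```python
from __future__ import annotations

from typing import Iterable, NamedTuple


class Flow(NamedTuple):
    """One observed conversation, reduced to who talks to whom and how (assumed model)."""
    src_mac: str
    dst_mac: str
    protocol: str


def learn_baseline(flows: Iterable[Flow]) -> frozenset[Flow]:
    """Collect every distinct flow seen during the learning window."""
    return frozenset(flows)


def find_new_flows(baseline: frozenset[Flow], observed: Iterable[Flow]) -> list[Flow]:
    """Return flows that never appeared while learning, in first-seen order."""
    reported: set[Flow] = set()
    new_flows: list[Flow] = []
    for flow in observed:
        if flow not in baseline and flow not in reported:
            reported.add(flow)
            new_flows.append(flow)
    return new_flows


def find_silent_flows(baseline: frozenset[Flow], observed: Iterable[Flow]) -> set[Flow]:
    """Return learned flows that are missing from the observed traffic."""
    return set(baseline) - set(observed)


# Hypothetical example data: a PLC, a drive, and an unexpected laptop.
PLC = "00:00:00:00:00:01"
DRIVE = "00:00:00:00:00:02"
LAPTOP = "00:00:00:00:00:99"

# The same two flows repeat over several production cycles.
learning_window = [Flow(PLC, DRIVE, "profinet"), Flow(DRIVE, PLC, "profinet")] * 3
baseline = learn_baseline(learning_window)

# Later traffic: the drive has stopped answering and a laptop pings the PLC.
later_traffic = [
    Flow(PLC, DRIVE, "profinet"),
    Flow(LAPTOP, PLC, "icmp"),
    Flow(LAPTOP, PLC, "icmp"),
]
new_flows = find_new_flows(baseline, later_traffic)
silent_flows = find_silent_flows(baseline, later_traffic)
```

Because the traffic repeats cycle after cycle, the baseline stays small even after a long learning window, and both a new participant and a participant that has gone quiet show up as deviations.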
Figure 1:
Figure 2:
The production cell consists of 32 components, including one Siemens PLC as the central control device, which is highlighted in Figure 3 (a circular representation of the connections on OSI layer 2, the MAC address level; the topology is a simple star-shaped network). Of course, the MAC addresses and IP addresses (if used) of the participating devices are always detectable.
Figure 3:
But passive monitoring can also reveal more information: vendor name, device name, and even order numbers and more. In many OT-specific protocols, information about the devices can be set and is sent regularly (for example after each restart or at the beginning of a production cycle). As a very rough estimate, in most OT environments between 50% and 60% of such basic information can be seen by passively monitoring the communication.
There will always be additional information that you need over and above what you get from passive monitoring. If you want to close this gap for network maintenance and cybersecurity reasons, a hybrid approach is possible. To avoid the aforementioned problems of unrestricted IT-like "hard" active measures, dedicated active measures can be used to augment the information already collected from passive monitoring.
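One way to keep active measures "dedicated" is to first work out exactly which attributes are still missing for which device, and to query only those. The following Python sketch shows that gap analysis. The field names, the inventory layout, and the example values are assumptions made for illustration and do not reflect the data model of any particular product.

```python
from __future__ import annotations

# Hypothetical attribute names, based on the kinds of information mentioned above.
WANTED_FIELDS = ("vendor", "device_name", "order_number")


def find_gaps(inventory: dict[str, dict[str, str]]) -> dict[str, list[str]]:
    """Map each device (keyed by MAC address) to the fields passive monitoring did not yield."""
    gaps: dict[str, list[str]] = {}
    for mac, attributes in inventory.items():
        missing = [field for field in WANTED_FIELDS if not attributes.get(field)]
        if missing:
            gaps[mac] = missing
    return gaps


# Made-up passive inventory: one complete device, one partial, one unknown.
passive_inventory = {
    "00:00:00:00:00:01": {
        "vendor": "Siemens",
        "device_name": "plc-1",
        "order_number": "example-123",
    },
    "00:00:00:00:00:02": {"vendor": "ExampleVendor"},
    "00:00:00:00:00:03": {},
}

gaps = find_gaps(passive_inventory)
```

Only the devices listed in `gaps` would then be candidates for a targeted active query, and fully described devices are left alone.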
As an example, dedicated active queries could be used to retrieve data from modern PLCs via OPC UA. Other examples: data can be obtained via smart gateways communicating over MQTT, or by using SNMP. A monitoring tool like PRTG is well suited for this, because it also provides functionality specific to industrial IT environments, such as support for OPC UA, MQTT, Modbus TCP, and more.
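To show how small a dedicated active query can be, here is a Python sketch that builds a single Modbus TCP "Read Holding Registers" request using only the standard library. Modbus TCP is used here simply because its frame layout is compact. The sketch is not how PRTG issues its queries, the function name and example values are hypothetical, and the code only constructs the bytes without sending anything on the network.

```python
import struct

READ_HOLDING_REGISTERS = 0x03  # Modbus function code
MODBUS_PROTOCOL_ID = 0         # always 0 for Modbus TCP


def build_read_request(transaction_id: int, unit_id: int,
                       start_address: int, quantity: int) -> bytes:
    """Build one Modbus TCP request frame (MBAP header followed by the PDU)."""
    if not 1 <= quantity <= 125:
        raise ValueError("a single request may read between 1 and 125 registers")
    # PDU: function code (1 byte), start address (2 bytes), quantity (2 bytes).
    pdu = struct.pack(">BHH", READ_HOLDING_REGISTERS, start_address, quantity)
    # MBAP header: transaction id, protocol id, length of what follows
    # (unit id plus PDU), and the unit id itself.
    header = struct.pack(">HHHB", transaction_id, MODBUS_PROTOCOL_ID,
                         len(pdu) + 1, unit_id)
    return header + pdu


# Example values are arbitrary: read two registers starting at address 0.
frame = build_read_request(transaction_id=1, unit_id=1, start_address=0, quantity=2)
```

The whole request is a 12-byte frame. A targeted query like this puts far less load on a sensitive device than an unrestricted scan would.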
The collaboration between PRTG and Rhebo Industrial Protector is a great example of a hybrid approach. Rhebo's passively monitored and collected data on OT infrastructure, vulnerabilities, anomalies that might indicate cyber attacks, and technical error states perfectly complements the active approach of PRTG.