Why monitor Unified Communications quality?

 Originally published on April 01, 2021 by Guest Author
Last updated on April 01, 2021 • 10 minute read

VQM (voice/video quality monitoring) is an advanced monitoring methodology used to mitigate and resolve performance-affecting issues before they become catastrophic. VQM works by leveraging both active (synthetic) and passive (live call/session) testing and analysis. With typical premises-based Unified Communications (UC) implementations, voice and video conferencing application traffic travels over the organization’s existing LAN/WAN infrastructure.

As with any other application data, network congestion can delay voice packets or even cause some to be lost along the way. While this is just a minor inconvenience for most data applications (e.g., an email taking slightly longer to arrive), even a few lost RTP (media) packets can introduce unacceptable gaps or distortion in a conversation. End users have become accustomed to nearly flawless voice quality and will not settle for less, in spite of the massive cost savings on the back end provided by UC services.

VQM solves these problems by analyzing live/in-service sessions and recording session state, start and end times, called party, and configuration information, as well as a robust suite of underlying service-affecting fault details. In addition, synthetic voice traffic generation ensures that service-affecting issues are identified both during and outside business hours by constantly monitoring infrastructure state and reporting on quality. This approach helps the VQM tool catch underlying issues before end users even notice service degradation.

Active Monitoring vs. Passive Analysis

To break it down into layman’s terms, there are effectively two schools of thought:

  1. Active Monitoring: application monitoring occurs by sending synthetic voice traffic back and forth between software agents and reporting on its quality. Think of this as the “secret shopper” model. It’s imperative that these agents not only send real media (RTP) but also support registration and signaling to allow them to effectively become valid addressable endpoints on your network. They can then exercise not just the media plane, but also signaling infrastructure to provide a complete view into back-end performance and reliability. This enables organizations to become aware of the issues before the end user reports them.
  2. Passive Analysis: Passive analysis describes the process of analyzing live traffic or in-service calls. Using this model, VQM is able to effectively monitor both the signaling and media planes to check for issues with any sessions traversing the enterprise network, identify any common threads and ideally alert operations personnel immediately. This lets them effectively address identified issues to minimize impact to the business.

Clearly, both approaches are complementary and provide value, ensuring that any issues that arise are identified immediately. It’s through a combination of these two approaches that Telchemy’s technology tests and analyzes every session, allowing network and application administrators to know if a problem is occurring and its source.

Disruptions in session quality vary in severity and may present themselves as distorted audio and video, cutting in and out, and more. Have you ever had a phone call with a coworker or customer where the quality was so bad that you spent the entire call asking them to repeat themselves? Or a phone call where you don’t hear anything? Or worse, you can hear them, but they can’t hear you?

Problems like these can seriously impact your ability to deliver great customer service. While internal calls are one thing, if end-users or customers have problems calling you, they may just quit calling.

There are three basic categories of performance-related problems that can occur in enterprise IP Telephony:

  1. IP network problems, such as jitter, packet loss, and delay
  2. Equipment configuration & signaling problems
  3. Analog/TDM interface problems, which include things like echo (signal reflection) or signal level
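
As a rough illustration of the first category, the sketch below estimates interarrival jitter using the running formula from RFC 3550 and derives a loss fraction from sequence numbers. The `RtpPacket` record and both function names are hypothetical, packet capture and parsing are assumed to happen elsewhere, and sequence-number wraparound is ignored for brevity.

```python
from dataclasses import dataclass


@dataclass
class RtpPacket:
    """Hypothetical record for one received RTP packet."""
    seq: int          # RTP sequence number
    rtp_ts: int       # RTP timestamp, in clock-rate units
    arrival_s: float  # local arrival time, in seconds


def interarrival_jitter_ms(packets, clock_rate=8000):
    """Running jitter estimate per RFC 3550: J += (|D| - J) / 16."""
    jitter = 0.0
    prev = None
    for pkt in packets:
        if prev is not None:
            # D: change in transit time between consecutive packets,
            # expressed in RTP timestamp units.
            arrival_delta = (pkt.arrival_s - prev.arrival_s) * clock_rate
            d = arrival_delta - (pkt.rtp_ts - prev.rtp_ts)
            jitter += (abs(d) - jitter) / 16.0
        prev = pkt
    # Convert from timestamp units to milliseconds.
    return jitter / clock_rate * 1000.0


def loss_fraction(packets):
    """Share of expected packets that never arrived (no wraparound handling)."""
    if not packets:
        return 0.0
    seqs = [p.seq for p in packets]
    expected = max(seqs) - min(seqs) + 1
    return 1.0 - len(set(seqs)) / expected
```

The default `clock_rate` of 8000 matches common narrowband audio codecs; other payload types would need their own clock rate.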

Network architects and managers need to address call quality and performance management problems during the planning and deployment phases, but should be aware that these problems also occur regularly during normal day-to-day network operation post-deployment.

Many quality-related UC problems are transient and can occur anywhere along the network path. For example, a user accessing a file from a server or a home-based worker watching a YouTube video may cause a brief bottleneck.

This can cause short-term degradation in call quality for other users on the network. Thus, it’s important that network managers use performance management tools, such as those provided by VQM, that are able to detect and measure these types of network impairments.
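
To illustrate why short-lived impairments are easy to miss, the sketch below compares an overall loss figure with loss measured per window of expected packets. The function names and the window size are assumptions chosen for illustration, not settings from any particular tool.

```python
def overall_loss(received_flags):
    """Loss fraction across the whole session."""
    return 1.0 - sum(received_flags) / len(received_flags)


def windowed_loss(received_flags, window=50):
    """Loss fraction for each consecutive window of expected packets."""
    results = []
    for start in range(0, len(received_flags), window):
        chunk = received_flags[start:start + window]
        results.append(1.0 - sum(chunk) / len(chunk))
    return results


# 1000 expected packets with one burst of 20 consecutive losses.
flags = [True] * 500 + [False] * 20 + [True] * 480
session_loss = overall_loss(flags)        # about 2% for the whole session
worst_window = max(windowed_loss(flags))  # about 40% inside the burst
```

A session-wide average makes this call look almost healthy, while the per-window view exposes a one-second stretch (at 50 packets per second) in which a large share of the audio was lost.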

The temporary nature of IP delivery issues also means that these problems are not easy to detect or reproduce. Problems are not necessarily associated with specific cables or line cards – they can occur randomly due to “collisions” or combinations of several different factors.

Network managers could attempt to use packet loss and jitter metrics to estimate call quality, but this approach is reactive, doesn’t correlate directly with the end-user experience and, more importantly, doesn’t provide enough diagnostic information on its own to determine the cause of the problem.

Paessler AG has partnered with Telchemy, the premier manufacturer of unified communications management solutions, to bring VQM into PRTG Network Monitor. The integration of Telchemy’s SQmediator application-level data sets PRTG further apart in the industry by providing insight not only into UC application performance, but also into the underlying network infrastructure that is responsible for delivering application traffic in a timely and reliable manner.

UC metrics classification

Assuming the signaling plane is operational and sessions are established properly, the resulting UC metrics can best be segregated into three classes:

  • MOS scores
  • Degradation factors
  • Acoustic and network metrics / impairments
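
To show how the first two classes relate, here is a minimal sketch that subtracts impairment values from a baseline R-factor and converts the result to an estimated MOS using the R-to-MOS mapping from ITU-T G.107 (the E-model). The baseline of 93.2 is the E-model default; the impairment inputs and function names are placeholders, and this is not the algorithm SQmediator or any other commercial tool uses.

```python
def r_factor(delay_impairment, equipment_impairment, base=93.2):
    """Simplified E-model rating: baseline minus impairments, clamped to 0..100.

    Both impairment arguments are illustrative inputs; deriving them from
    measured delay, codec, and loss is outside the scope of this sketch.
    """
    r = base - delay_impairment - equipment_impairment
    return max(0.0, min(100.0, r))


def r_to_mos(r):
    """Map an R-factor to an estimated MOS (ITU-T G.107 conversion)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6


clean_call = r_to_mos(r_factor(0, 0))           # roughly 4.4
impaired_call = r_to_mos(r_factor(10.0, 13.2))  # roughly 3.6
```

Keeping the impairments separate from the final score is what lets a tool report not only that a call scored poorly, but which degradation factor contributed most.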

Troubleshoot VQM in 6 steps

Proper troubleshooting typically follows this order of operations:

  1. Check session MOS scores – these are typically reported both per-session as well as per-direction. When scores are significantly degraded, examine calls placed from the location in question over a specific time period to quickly identify the common degradation factor.
  2. Further correlate session degradation factors to determine where to head next. Degradation sources can be used to effectively isolate session impairments to help guide troubleshooting efforts.
  3. Using the results from step 2, dig deeper into the tool to extract per-session & per-direction statistics to gain a better understanding of the underlying issues as perceived by the measurement point/end-user. SQmediator provides expert analysis on a per-call basis to assist with effectively interpreting low level diagnostics.
  4. To confirm the diagnosis, compare UC sessions that traversed similar paths during the same time period to see whether other sessions experienced similar symptoms and performance.
  5. Using these results, take corrective actions to rectify network and application performance.
  6. Continue to monitor incoming performance data points, both active as well as passive, to verify the identified issue has been fixed.
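
Steps 1 and 2 can be sketched as a simple grouping exercise: filter degraded sessions for one location, then count which degradation factor appears most often. The record fields (`location`, `mos`, `top_factor`) and the MOS threshold are hypothetical and do not reflect SQmediator’s or PRTG’s actual data format.

```python
from collections import Counter


def common_degradation_factor(sessions, location, mos_threshold=3.6):
    """Return (factor, share) for the most frequent degradation factor
    among degraded sessions at a location, or None if none are degraded."""
    degraded = [
        s for s in sessions
        if s["location"] == location and s["mos"] < mos_threshold
    ]
    if not degraded:
        return None
    counts = Counter(s["top_factor"] for s in degraded)
    factor, hits = counts.most_common(1)[0]
    return factor, hits / len(degraded)


# Illustrative records only; real data would come from the monitoring tool.
sessions = [
    {"location": "Boston", "mos": 2.9, "top_factor": "packet loss"},
    {"location": "Boston", "mos": 3.1, "top_factor": "packet loss"},
    {"location": "Boston", "mos": 3.3, "top_factor": "jitter"},
    {"location": "Boston", "mos": 4.2, "top_factor": "none"},
    {"location": "Austin", "mos": 4.3, "top_factor": "none"},
]
result = common_degradation_factor(sessions, "Boston")
```

In this made-up data set, packet loss accounts for two of the three degraded Boston sessions, which would point the investigation in step 3 toward the network path rather than endpoint configuration.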

Are you experiencing unified communications quality issues with your current system? Paessler and Telchemy can help. Contact us today to learn more.

About the author: Anthony Caiozzo is a self-starting, perpetually motivated professional with almost three decades of broad experience installing, troubleshooting, building, promoting and selling technical products and services.
Experienced in all stages of the product life cycle, Anthony is familiar with a broad range of telecom and Internet technologies, and in defining win/win partner relationships. Anthony specializes in translating technical jargon and complexity into easy-to-understand content, helping people worldwide learn about the ins and outs of their communications systems and managed services.
When Anthony isn't working hard to promote his team’s achievements at Telchemy, he spends the remainder of his time with his wife, four children, dog, and seven chickens, and restoring his historic New England farmhouse.