Paessler Blog - All about IT, Monitoring, and PRTG

PRTG & The Exchange Admin (Part 1/6): 8-Out-Of-The-Box-Sensors That Will Save Your @ss

Written by Simon Bell | Jun 19, 2017

Part 1 – Exchange Sensors Included In PRTG

In this, the first in a series of articles looking at how PRTG can help Exchange Admins to manage their systems, we look at the continued popularity of email as a corporate communications tool. We’ll also see how PRTG’s pre-defined Exchange sensors can provide a great overview of system health and performance. In subsequent articles, we’ll see how custom sensors can provide even deeper insight into the many components and sub-systems that make up an Exchange infrastructure.

The death of email has been predicted many times in recent years, with various justifications – instant messaging is better, spam makes email insecure and inefficient, social media is cooler, people change addresses too often. However, as is often the case, the facts do not agree with the pundits’ opinions, as the Email Statistics Report 2015-2019 by The Radicati Group shows:

 

| | 2015 | 2016 | 2017 | 2018 | 2019 |
|---|---|---|---|---|---|
| Global Email Accounts (millions) | 4353 | 4626 | 4920 | 5243 | 5594 |
| % Growth | | 6% | 6% | 7% | 7% |

 

 

| | 2015 | 2016 | 2017 | 2018 | 2019 |
|---|---|---|---|---|---|
| Global Daily Email Messages (billions) | 206 | 215 | 225 | 236 | 247 |
| % Growth | | 5% | 5% | 5% | 5% |

 

So, far from becoming extinct, email continues to grow in prevalence and popularity. The move towards cloud-based services is starting to change the way organisations provision their email services. Research from Gartner shows that around 13 percent of publicly listed companies have already moved their email into the cloud.

But this means that most organisations are still using on-premises email services, and while surveys differ on the precise market share, they all agree that Microsoft Exchange is still the clear market leader when it comes to business email systems.

 

Since its initial release in 1996, Exchange has evolved from a relatively simple X.400-based messaging system into a complex application that provides many features:

  • Email Send & Receive via MAPI, IMAP, POP3, SMTP Protocols
  • Meeting, Appointment and Resource Scheduling
  • Contact Management
  • Task Management
  • Collaboration & Shared Folders
  • Spam & Virus Filtering and Protection
  • Mobile Device Synchronisation
  • Web Based Access

In turn, these features rely on many aspects of the IT infrastructure – servers, network, storage and the rest must all be performing optimally for Exchange to fulfil its function as the primary communications tool for most organisations. This is where PRTG can make an Exchange Admin’s life easier, by ensuring that all the supporting infrastructure, and the Exchange system itself, is healthy and performant.

In subsequent articles, we’ll look at how PRTG’s Custom Sensors can be used to “deep-dive” into the health of the Exchange system, what metrics we should be monitoring, and some of the performance danger signs to look out for. But to start with, let’s take a look at PRTG’s out-of-the-box Exchange sensors.

Before we get into the specifics of the individual sensors, a word about thresholds and limits. Where possible, I’ve tried to give guidance about the “healthy” values you should look for from these sensors, but for many of them the performance figures will vary across deployments: an Exchange system serving a 10-person company will perform very differently to one in a global corporation or a multi-tenant MSP environment.

This is why baselining is important when setting up a new monitoring system or adding new systems & devices. After installation, it’s a good idea to leave PRTG gathering data for a week or two, before setting thresholds & notifications. That way, you can get a feel for what is “normal” performance in your environment, and you can use this data to set your thresholds accordingly. One of the most common reasons for monitoring projects failing is the lack of baselining before activating notifications. Support teams quickly learn to ignore floods of false-positive notifications coming from a badly tuned monitoring system and this inevitably leads to real, critical alerts being missed.
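
As a rough illustration of how baseline data might be turned into thresholds, here is a minimal Python sketch. It assumes you have exported a set of readings for one channel from the baseline period; the function name `suggest_limits`, the sample values, and the "mean plus two or three standard deviations" rule are all illustrative assumptions, not something PRTG prescribes.

```python
import statistics


def suggest_limits(samples, warn_sigma=2.0, error_sigma=3.0):
    """Suggest upper warning/error limits from baseline readings.

    Assumption: the limits are placed a fixed number of standard
    deviations above the baseline mean. Other rules (percentiles,
    fixed headroom) may suit your environment better.
    """
    if len(samples) < 2:
        raise ValueError("need at least two baseline samples")
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    return {
        "mean": mean,
        "warning": mean + warn_sigma * stdev,
        "error": mean + error_sigma * stdev,
    }


# Hypothetical RPC latency readings (ms) gathered during the baseline period.
baseline_rpc_latency_ms = [12, 15, 11, 14, 13, 18, 12, 16, 14, 15]
limits = suggest_limits(baseline_rpc_latency_ms)
print(limits)
```

The resulting values are only a starting point for a channel's warning and error limits; a week or two of real data from your own environment should always take precedence over any generic rule.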

So, with that advice in mind, let’s take a look at the Exchange-specific sensors that are included with every PRTG license, including the 100-sensor freeware version.

WMI Exchange Server Sensor

This is a great general-purpose Exchange sensor. Adding this to a server will allow you to choose from over a dozen different metrics that report on the health and performance of many of the key components of Exchange. The specific sensors available will vary based on the roles and configuration of the server it is assigned to.

| Metric | Description | Recommendation |
|---|---|---|
| Queue size | The number of messages waiting to be processed in the message queues | The lower the better, ideally 0 |
| Average delivery time | The average time in seconds between the submission of a message to the public folder store and submission to other storage providers for the last 10 messages. | The lower the better |
| Logon operations per second | Shows the number, per second, of mailbox store logon operations. | N/A |
| Sent, delivered, and submitted messages per second | This shows the number of messages processed by Exchange per second. Large changes to this number, either up or down, could indicate problems. | N/A |
| Messages queued for submission | Shows the current number of submitted messages not yet processed by the transport system | Should not exceed 50. Queue should not persist for more than 15 mins. |
| Remote Procedure Call (RPC) packets and operations per second | Shows the rate at which RPC operations occur, per second. | N/A |
| RPC latency, requests, and slow packets | This measures the overall performance of the RPC subsystem | Fast is better |
| RPC sent, slow, outstanding, and failed requests (store interface) | Another indicator of RPC performance | Fast is better |
| Read and write bytes RPC clients per second | Another indicator of RPC performance. Large changes to this number, either up or down, could indicate problems | N/A |
| Number of active and anonymous users | Number of users connected to the mailbox store | N/A |
| Database page faults per second | The rate that database file page requests require the database cache manager to allocate a new page from the database cache. | The lower the better |
| Log record stalls per second | This shows the number of log records that cannot be added to the log buffers per second because the log buffers are full | The lower the better. Log stalls will lead to increases in RPC latency. Could be caused by disk I/O bottlenecks. |
| Log threads waiting | The number of threads waiting for their data to be written to the log in order to complete an update of the database. | The lower the better |
| Database cache size in bytes and miss in percent | The amount of system memory used by the database cache manager to hold commonly used information from the database file(s) to prevent file operations. | % miss should be as low as possible |
| Current unique users (OWA) | Shows the number of active users currently logged into the Outlook Web Application | N/A |
| Average response time (OWA) | The average time (in milliseconds) that elapsed between the beginning and end of an OEH or ASPX request. | The lower the better |

 

Some of the features of this sensor are also available in separate PowerShell-based sensors. As with all WMI-based sensors, this will have a relatively high impact on PRTG’s system performance. We recommend using fewer than 200 WMI-based sensors per Probe.
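
The recommendation for the “Messages queued for submission” metric combines two conditions: a size limit (50) and a persistence limit (15 minutes). The Python sketch below shows one way that rule could be evaluated against a series of readings. The function name, the data layout, and the reading of “persist” as “continuously non-empty” are assumptions made for illustration.

```python
from datetime import datetime, timedelta

MAX_QUEUED = 50                       # "Should not exceed 50"
MAX_PERSIST = timedelta(minutes=15)   # "should not persist for more than 15 mins"


def check_submission_queue(readings):
    """Evaluate (timestamp, queued_count) readings, oldest first.

    Assumption: a queue "persists" while every reading is non-zero.
    """
    if not readings:
        return {"over_limit": False, "persistent": False}
    latest_time, latest_count = readings[-1]
    # Walk backwards to find when the current non-empty streak began.
    streak_start = None
    for ts, count in reversed(readings):
        if count == 0:
            break
        streak_start = ts
    persistent = (
        streak_start is not None
        and latest_time - streak_start > MAX_PERSIST
    )
    return {"over_limit": latest_count > MAX_QUEUED, "persistent": persistent}


# Hypothetical readings taken every five minutes.
start = datetime(2017, 6, 19, 9, 0)
samples = [
    (start + timedelta(minutes=5 * i), n)
    for i, n in enumerate([0, 4, 9, 12, 20, 31])
]
result = check_submission_queue(samples)
print(result)
```

In this example the queue never exceeds 50 messages, but it has been non-empty for 20 minutes, so the persistence condition is the one that flags a problem.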

WMI Exchange Transport Queue Sensor

Another general-purpose sensor that will also create individual sensors for the objects selected. This sensor provides statistics for over 30 of the various message queues used to transport email from the sender, through the Exchange system, to the recipient. You can choose different sensors for each of the “high”, “normal”, “low” and “none” message priorities that Exchange uses, as well as “total” to see an overall summary. Setting thresholds or limits on some of the more important queues is a great way to ensure that mail delivery is taking place, as any increase in the number of messages being held in a queue would indicate a message delivery problem. In general, look for low values in the “queue length” channels and high values in the “items completed” channels.

 

 

Find more information at the manual.

 

The rest of the pre-defined sensors are all PowerShell based, so there are some pre-requisites that must be taken care of before they can be used.

Both Remote PowerShell and Remote Exchange Management Shell must be enabled on the target system, and PowerShell 2.0, or later, must be installed on the server running the Probe on which the sensor is running. This page provides details of how to use PowerShell based sensors. In particular, make sure the execution policy is set to "unrestricted" to allow scripts to run. This needs to be done for the version of PowerShell that is invoked by PRTG, which is not necessarily the version that appears on the Start Menu. If this isn't done you will probably get an "Unauthorised Access" error on the sensor.

To fix this, on the Core Server or Remote Probe, where the sensor is to be created, open a CMD prompt as an Administrator (not a PowerShell session) and type the following:


%systemroot%\SysWOW64\WindowsPowerShell\v1.0\powershell.exe


When the command prompt changes to "PS", enter the following command:


Set-ExecutionPolicy Unrestricted

Exchange Backup PowerShell Sensor

This PowerShell-based sensor must be assigned to a server holding the Mailbox server role (rather than a CAS or Transport Server). It will return details of the backup status of the mail database(s) held on the server. It contains channels for:

  • Time Since Last Full Backup
  • Time Since Last Copy Backup
  • Backup Currently in Progress

Use this to keep a check on the history of your Exchange backups. Setting a limit on the “Time Since” channels will notify you if the system goes too long without a backup being taken.
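
To make the idea of a limit on a “Time Since” channel concrete, here is a small Python sketch that classifies the age of the last full backup. The 26-hour warning and 50-hour error limits are hypothetical values for a nightly backup schedule, and `backup_status` is an illustrative helper, not part of PRTG.

```python
from datetime import datetime


def hours_since(last_backup, now):
    """Return the elapsed time between two datetimes in hours."""
    return (now - last_backup).total_seconds() / 3600


def backup_status(last_full_backup, now, warn_hours=26, error_hours=50):
    """Classify backup age against assumed warning/error limits.

    Assumption: backups run nightly, so slightly more than one day
    triggers a warning and slightly more than two days an error.
    """
    age = hours_since(last_full_backup, now)
    if age >= error_hours:
        return "error"
    if age >= warn_hours:
        return "warning"
    return "ok"


# Hypothetical timestamps: the last full backup finished 33 hours ago.
last_full = datetime(2017, 6, 17, 23, 0)
now = datetime(2017, 6, 19, 8, 0)
status = backup_status(last_full, now)
print(status)
```

The right limits depend entirely on your backup schedule; a weekly full backup would need much larger values than the ones assumed here.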

Find more information at the manual.

Exchange Database PowerShell Sensor

Another PowerShell-based sensor, this one checks the operational state of the database that holds the individual mailboxes and specifically reports on:

  • Database Size
  • Mount State
  • Validity

Limits on the “Validity” and “Mount State” channels will notify you if the database goes offline or experiences corruption.

 

Find more information at the manual.

Exchange Database DAG PowerShell Sensor

Introduced in Exchange 2010, Database Availability Groups (DAGs) form the basis of Exchange’s high-availability and resilience features. A DAG is a group of up to 16 Mailbox servers (in Exchange 2016) that host a set of mail store databases and provide automatic database-level recovery in the event of a failure of individual servers or databases. This sensor provides detailed information on the status of an Exchange DAG:

  • Overall DAG status (for example, if it is mounted, failed, suspended)
  • Copy status (active, not active)
  • Content index status (healthy, crawling, error)
  • If activation is suspended
  • If log copy queue is increasing
  • If replay queue is increasing
  • Length of copy queue
  • Length of Replay queue
  • Number of single page restores

Limits assigned to the various queue lengths will notify the administrator of problems with DAG replication.
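
Several of the channels above report whether a queue is “increasing”. How PRTG decides this is not covered here, but the underlying idea can be sketched in a few lines of Python. The helper `is_increasing` and its rule (the last three readings each higher than the one before) are assumptions chosen purely for illustration.

```python
def is_increasing(lengths, min_samples=3):
    """Return True if the most recent readings rise strictly.

    Assumption: a queue counts as "increasing" when each of the last
    `min_samples` readings is larger than the one before it.
    """
    recent = lengths[-min_samples:]
    if len(recent) < min_samples:
        return False
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


# Hypothetical queue lengths sampled at regular intervals.
copy_queue = [0, 2, 1, 4, 9, 15]
replay_queue = [3, 3, 2, 2, 0, 0]

copy_growing = is_increasing(copy_queue)
replay_growing = is_increasing(replay_queue)
print(copy_growing, replay_growing)
```

A steadily growing copy queue alongside a flat or shrinking replay queue, as in this example, would point at log shipping between DAG members rather than at log replay on the passive copy.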

 

Find more information at the manual.

 

Exchange Mail Queue PowerShell Sensor

The Mail Queue Sensor monitors the number of items in the outgoing mail queue of an Exchange Server. Like the WMI-based Transport Queue Sensor mentioned above, this is a great sensor for checking that outbound email is leaving your mail system. Assigning limits to the various channels in this sensor will allow the administrator to see immediately if messages are backing up. For all channels, lower values are better.

  • Number of queued mails
  • Number of retrying mails
  • Number of unreachable mails
  • Number of poisonous mails

Find more information at the manual.

 

Exchange Mailbox PowerShell Sensor

The Mailbox Sensor returns metrics for individual user and system mailboxes. The data returned includes:

  • Total size of items in place
  • Number of items in place
  • Past time since the last mailbox logon

Assigning limits on this sensor is a great way for Admins to be warned when individual mailboxes are approaching policy size limits, and to identify unused or orphaned mailboxes.
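
The two uses described above (spotting mailboxes near their size limit and spotting unused ones) can be sketched in Python as follows. The `MailboxReading` class, the 2048 MB quota, the 90% warning ratio, and the 90-day idle period are all hypothetical values; substitute the figures from your own mailbox policy.

```python
from dataclasses import dataclass


@dataclass
class MailboxReading:
    """One set of channel values for a mailbox (illustrative structure)."""
    name: str
    size_mb: float
    days_since_logon: float


def review(readings, quota_mb=2048, warn_ratio=0.9, stale_days=90):
    """Return (mailboxes near quota, mailboxes with no recent logon)."""
    near_quota = [r.name for r in readings if r.size_mb >= quota_mb * warn_ratio]
    stale = [r.name for r in readings if r.days_since_logon >= stale_days]
    return near_quota, stale


# Hypothetical readings for three mailboxes.
readings = [
    MailboxReading("alice", 1900, 1),
    MailboxReading("bob", 300, 200),
    MailboxReading("shared-sales", 2040, 120),
]
near_quota, stale = review(readings)
print("Near quota:", near_quota)
print("No recent logon:", stale)
```

A mailbox that appears in both lists, such as the shared mailbox in this example, is a good candidate for archiving before it reaches its limit.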

Find more information at the manual.

 

Exchange Public Folder PowerShell Sensor

Microsoft have been talking about deprecating Public Folders in Exchange for several years, but they’re still available in Exchange 2016. This sensor returns the same statistics for Public Folders as the Mailbox Sensor does for individual mailboxes (see above):

  • Total size of items in place
  • Number of items in place
  • Past time since the last mailbox logon

Find more information at the manual.

These out-of-the-box sensors will provide Exchange Admins with a good overview of the health of their systems. But by using PRTG’s Custom Sensors we can get an even deeper insight into how well our Exchange servers are performing and we’ll look into how to do this in the next part of this series.

Part 2: Your Secret Weapon for Monitoring Exchange: Custom WMI, PerfMon and Script Sensors
Part 3: Metrics That Matter: Processor and Process Metrics for MS Exchange
Part 4: PRTG & The Exchange Admin - Metrics That Matter: Memory