Monitoring provider networks is essential to keep service delivery from being interrupted. Several tools and protocols can be used for this, but in this article we will focus on Zabbix with monitoring via SNMP v2.

Starting with the network edge router (the BGP router), the items to prioritize in monitoring are: BGP sessions; physical and logical interfaces (traffic, operational status, error rate, and fiber signal level); and finally the equipment status (temperature, CPU usage, and memory usage).
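As an illustration of how traffic and error rate are derived from interface counters, here is a minimal Python sketch. It assumes two consecutive readings of IF-MIB counters such as ifHCInOctets (64-bit) and ifInErrors/ifInUcastPkts (32-bit); the function names and sample values are hypothetical, and Zabbix normally performs this delta calculation itself.

```python
# Sketch: derive traffic and error rates from two SNMP counter samples.
# Function names and sample values are invented for illustration.

COUNTER32_MOD = 2**32  # e.g. ifInErrors, ifInUcastPkts
COUNTER64_MOD = 2**64  # e.g. ifHCInOctets, ifHCOutOctets


def counter_delta(previous: int, current: int, modulus: int) -> int:
    """Difference between two counter readings, tolerating a single wrap."""
    return (current - previous) % modulus


def bits_per_second(prev_octets: int, curr_octets: int, interval_s: float) -> float:
    """Average throughput over the polling interval, in bits per second."""
    return counter_delta(prev_octets, curr_octets, COUNTER64_MOD) * 8 / interval_s


def error_ratio(prev_errors: int, curr_errors: int,
                prev_packets: int, curr_packets: int) -> float:
    """Errored packets as a share of errored plus delivered packets."""
    errors = counter_delta(prev_errors, curr_errors, COUNTER32_MOD)
    delivered = counter_delta(prev_packets, curr_packets, COUNTER32_MOD)
    total = errors + delivered
    return errors / total if total else 0.0


# Two hypothetical polls taken 60 seconds apart.
print(bits_per_second(1_000, 7_501_000, 60))  # 7,500,000 octets in 60 s
print(error_ratio(0, 10, 0, 990))             # 10 errors, 990 delivered
```

The modulo in counter_delta handles a counter that wrapped once between polls; a device reboot resets counters to zero and would need separate handling.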

Monitoring BGP sessions is essential for detecting and correcting problems that can cause slowdowns or even outages, which degrade the experience of the provider’s end customers. Monitoring the interfaces allows quick detection of fiber breaks and faulty ports, which speeds up the correction of problems that could interrupt the connection delivered to the customer.
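To make the BGP check concrete, the sketch below interprets bgpPeerState values, which the standard BGP4-MIB (RFC 4273) defines as integers from 1 (idle) to 6 (established). The peer addresses and the sessions_down helper are hypothetical, and some vendors expose BGP state through their own MIBs instead, so treat this as one possible approach.

```python
from typing import Dict

# bgpPeerState values as defined in BGP4-MIB (RFC 4273).
BGP_PEER_STATES = {
    1: "idle",
    2: "connect",
    3: "active",
    4: "opensent",
    5: "openconfirm",
    6: "established",
}
ESTABLISHED = 6


def sessions_down(peer_states: Dict[str, int]) -> Dict[str, str]:
    """Return the peers whose session is not established, with the state name."""
    return {
        peer: BGP_PEER_STATES.get(state, f"unknown({state})")
        for peer, state in peer_states.items()
        if state != ESTABLISHED
    }


# Hypothetical poll result: peer address -> bgpPeerState value.
polled = {"192.0.2.1": 6, "192.0.2.2": 3, "198.51.100.7": 1}
print(sessions_down(polled))
# {'192.0.2.2': 'active', '198.51.100.7': 'idle'}
```

Any state other than established means the session is not exchanging routes, so a trigger on "state is not 6" is a reasonable starting point.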

Status information reflects the health of the equipment. By monitoring it, you can tell whether the equipment needs more cooling or more memory (where an upgrade is possible), and you can plan to replace the equipment before it reaches its limit.
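Planning a replacement "before it reaches its limit" amounts to projecting a trend forward. The sketch below fits a straight line through daily usage samples and estimates how many days remain until a chosen limit. The function name, sample data, and the 90% limit are assumptions for illustration; Zabbix has its own predictive trigger functions, and this only shows the underlying idea.

```python
from typing import Optional, Sequence


def days_until_limit(daily_usage: Sequence[float], limit: float) -> Optional[float]:
    """Estimate days until `limit` with a least-squares line through daily samples.

    Returns None if there are fewer than two samples or usage is not growing.
    """
    n = len(daily_usage)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage) / n
    slope_num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_usage))
    slope_den = sum((x - mean_x) ** 2 for x in range(n))
    slope = slope_num / slope_den
    if slope <= 0:
        return None  # flat or falling usage: no projected limit
    intercept = mean_y - slope * mean_x
    fitted_today = intercept + slope * (n - 1)
    return max(0.0, (limit - fitted_today) / slope)


# Hypothetical memory usage (%) over five days, growing 2 points per day.
memory_percent = [60, 62, 64, 66, 68]
print(days_until_limit(memory_percent, limit=90))  # 11.0
```

A linear fit is a deliberate simplification; real usage is often seasonal, so the estimate is best read as a rough planning signal.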

Then we have the network backbone, which is generally made up of high-capacity switches, CGNAT and BNG/BRAS/Concentrators, in which we monitor the following items:

Switches and CGNAT – Physical and logical interfaces and equipment status;

BNG/BRAS/Concentrator – In addition to interfaces and status, we can monitor the number of active PPPoE clients.

Monitoring the number of active PPPoE clients helps detect client outages. Combined with OLT monitoring, it becomes a powerful tool: before clients start calling to ask what happened, the provider is already working on the problem.
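One simple way to turn the PPPoE session count into an outage signal is to compare consecutive polls and alert on a sudden drop. The sketch below assumes the session count has already been collected; the function names and the 10% threshold are hypothetical and would need tuning for each network.

```python
def session_drop_percent(previous: int, current: int) -> float:
    """Percentage of sessions lost between two polls (0.0 if the count did not fall)."""
    if previous <= 0 or current >= previous:
        return 0.0
    return (previous - current) * 100 / previous


def is_mass_outage(previous: int, current: int, threshold_percent: float = 10.0) -> bool:
    """True when the drop between two polls reaches the alert threshold."""
    return session_drop_percent(previous, current) >= threshold_percent


# Hypothetical polls: 2000 sessions, then 1700 one interval later.
print(session_drop_percent(2000, 1700))  # 15.0
print(is_mass_outage(2000, 1700))        # True
print(is_mass_outage(2000, 1990))        # False
```

A relative threshold tends to work better than an absolute one here, since normal churn scales with the number of subscribers on the concentrator.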

In OLT monitoring we have the uplink interfaces (with the same items as the physical interfaces), equipment status, and traffic per PON port. PON monitoring helps pinpoint which group of customers lost connectivity or was affected by a fiber break.

Finally, on the servers, we monitor physical servers (hypervisor or bare metal) and VMs. VM monitoring includes system uptime and running processes, as well as VM and server status. Server monitoring covers the general status of the equipment.

By monitoring the VMs, we help ensure that the services running on them remain available. By monitoring the server, we keep track of the health of the equipment and can plan upgrades or replacements before it reaches its limit.
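The uptime and process checks on VMs can be sketched as follows. SNMP reports uptime in TimeTicks, which are hundredths of a second, so a small value means the system restarted recently. The function names, the 10-minute window, and the process names are hypothetical examples, not a list taken from any real deployment.

```python
from typing import Iterable, List

TIMETICKS_PER_SECOND = 100  # SNMP TimeTicks count hundredths of a second


def uptime_seconds(timeticks: int) -> float:
    """Convert an SNMP TimeTicks uptime value to seconds."""
    return timeticks / TIMETICKS_PER_SECOND


def restarted_recently(timeticks: int, window_s: int = 600) -> bool:
    """True if the system came up within the last `window_s` seconds."""
    return uptime_seconds(timeticks) < window_s


def missing_processes(required: Iterable[str], running: Iterable[str]) -> List[str]:
    """Required process names that are absent from the running list."""
    running_set = set(running)
    return sorted(name for name in required if name not in running_set)


# Hypothetical values for one VM.
print(restarted_recently(45_000))  # 450 s of uptime -> True
print(missing_processes(["nginx", "named", "radiusd"], ["named", "sshd", "nginx"]))
# ['radiusd']
```

An unexpected restart or a missing process does not always mean the service is down, but both are cheap early indicators worth alerting on.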

Lucas Kazama | Made4noc coordinator