Introducing NetSpyGlass


NetSpyGlass simplifies network performance monitoring and diagnostics for the most challenging networks: those that are large, growing, or constantly changing to support the demands of the business. It is a fully programmable, highly scalable automation platform that provides real-time network mapping, monitoring, visualization, analytics and more for operators of multi-vendor data center and WAN networks.

What makes NetSpyGlass unique among the many network monitoring solutions available?

NetSpyGlass is a Fully-Programmable Platform


NetSpyGlass is fully programmable through its embedded Python interpreter, which runs scripts that perform a wide range of functions, from simple to highly complex and interdependent. For NetSpyGlass users, this means that the most challenging network monitoring problems can be solved at scale with programmatic precision.

A high-level overview of the NetSpyGlass architecture is illustrated below. While conceptually simple to grasp, the platform provides both the power and the flexibility to accommodate a wide range of use cases.



Key architectural features include:

  • A monitoring data collection subsystem designed to accommodate a wide variety of SNMP data sources, including network devices from most leading vendors.
  • A powerful visualization system that enables users to hierarchically organize and manage network maps with drill-down, overlays of monitoring data, color-coded notifications, ad hoc views and more.
  • An external time-series database to store monitoring data in support of a number of critical features such as real-time visualization, reporting, search, analytics and more.
  • A sophisticated classification and tagging system that builds an internal object data model for powerful identification, isolation and viewing capabilities.
  • Integration with external business systems for access to monitoring data via a well-documented REST API.
  • An advanced alerting system that ensures that no actionable network event will go unnoticed while simultaneously reducing false positive alerts.
  • Sophisticated graphs, dashboards and reporting tools to support the most demanding analytical and reporting requirements.

Embedded Python Interpreter


At the core of the NetSpyGlass architecture is an embedded Python interpreter upon which the “fully-programmable” feature of the platform is founded.

Out of the box, NetSpyGlass can perform discovery, mapping and most core functions with little or no programming, thanks to the many libraries, modules and scripts included in the product’s standard configuration. However, appreciating the true power of the platform requires a basic understanding of the central role of the Python interpreter and how it can be used for automation. Everything, from adding devices to the configuration through to data processing scripts and alerts, can be scripted and automated.

Calculations


An example of a Python script that calculates interface traffic in bits/sec is shown below. These three lines of Python can process hundreds of thousands of metrics:


if_hc_in_octets = import_var('ifHCInOctets')
if_in_rate = mul(median(rate(if_hc_in_octets, 4)), 8)
export_var('ifInRate', if_in_rate)

The Python interpreter is used to process collected monitoring data. Operating on lists of monitoring variables, scripts can modify existing variables or create new ones, create log records, add or modify tags on monitoring variables, trigger alerts and much more.
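
To make the arithmetic behind the three-line script concrete, here is a minimal plain-Python sketch of the same idea: turn an octet counter into per-second rates, smooth them with a median, and multiply by 8 to get bits/sec. It does not use NetSpyGlass functions. The helper names and sample data are hypothetical, and treating the argument 4 as a smoothing window is an assumption, since the text above does not describe its meaning.

```python
from statistics import median


def counter_rates(samples):
    """Per-second rates from (timestamp_sec, counter_value) samples, oldest first."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0 or c1 < c0:
            # Skip intervals where the clock did not advance or the
            # counter went backwards (reset or wrap).
            continue
        rates.append((c1 - c0) / dt)
    return rates


def smoothed_bits_per_sec(samples, window=4):
    """Median of the last `window` octet rates, converted to bits/sec."""
    rates = counter_rates(samples)[-window:]
    if not rates:
        return None
    return median(rates) * 8  # 8 bits per octet


# Hypothetical ifHCInOctets samples taken 60 seconds apart; the third
# interval contains a short burst that the median suppresses.
samples = [(0, 0), (60, 750_000), (120, 1_500_000),
           (180, 9_000_000), (240, 9_750_000)]
in_rate_bps = smoothed_bits_per_sec(samples)
```

The median is used here rather than a mean so that a single burst or a bad sample does not dominate the reported rate.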

Analytics


The embedded Python interpreter is accompanied by a set of proprietary NetSpyGlass Python modules that provide the functions available to scripts, such as performing calculations on monitoring data and generating new metrics. An example is shown below:


from nw2functions import *

# Compute the sum of outbound traffic through interfaces of all devices that
# have the tag 'ifBGP4Peer.AS174'. This tag is added automatically to all interfaces
# that carry BGP peering sessions with AS174 (COGENT). Add the result as a new
# instance of the same monitoring variable 'ifOutRate'.

if_out_rate = filter_by_tags(import_var('ifOutRate'), ['ifBGP4Peer.AS174'])
aggr = new_var('Cogent', 'peering')
aggregate(aggr, if_out_rate)
aggr.addTag('VariableTags.Aggregate')
export_var('ifOutRate', [aggr])

Simple Python scripts such as the one shown above can be used to perform powerful analytical processing such as metric data normalization, calculating aggregates and statistical analysis.
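
The tag-based aggregation above can be pictured with a small stand-alone sketch. It models each metric as a simple record with a set of tags, keeps those that carry every requested tag, and sums their values into a new record. The `Metric` class, both helper functions and the sample values are hypothetical, and the assumption that a list of tags means "all of these tags" is not confirmed by the text above.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    """Hypothetical stand-in for one instance of a monitoring variable."""
    device: str
    component: str
    value: float
    tags: set = field(default_factory=set)


def filter_metrics_by_tags(metrics, required_tags):
    """Keep metrics that carry every tag in required_tags (assumed AND semantics)."""
    required = set(required_tags)
    return [m for m in metrics if required <= m.tags]


def sum_metrics(metrics, device, component):
    """Sum metric values into a new synthetic metric marked as an aggregate."""
    total = sum(m.value for m in metrics)
    return Metric(device, component, total, {'VariableTags.Aggregate'})


# Hypothetical outbound rates in bits/sec
metrics = [
    Metric('rtr1', 'xe-0/0/0', 4.0e9, {'ifBGP4Peer.AS174'}),
    Metric('rtr2', 'xe-1/0/0', 2.5e9, {'ifBGP4Peer.AS174', 'ifRole.eBgpPeer'}),
    Metric('rtr2', 'xe-1/0/1', 1.0e9, {'ifBGP4Peer.AS3356'}),
]
cogent = sum_metrics(
    filter_metrics_by_tags(metrics, ['ifBGP4Peer.AS174']), 'Cogent', 'peering')
```

Because the aggregate is stored like any other metric, it can be graphed, queried and alerted on in the same way as a per-interface value.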

Alerts


Alerts are programmed in Python and can implement complex logic, including data manipulation and filtering by device, component or tag as shown in the sample script below:


def alert_edge_paid_egress_above_90pct(log):
    mvlist = import_var('ifOutUtilization')
    # alert on interfaces with role eBgpPeer and ifDescription.PAID and only if they are active
    mvlist = filter_by_tags(mvlist, ['ifRole.eBgpPeer', 'ifDescription.PAID', 'ifDescription.ACTIVE'])
    alert(
        name='alert_edge_paid_egress_above_90pct',
        input=mvlist,
        condition=lambda mvar, value: value > 0.9,
        description='Paid peering interface egress utilization >90%',
        details={
            'Device Owner': 'neteng-team',
            'Device Name': '$alert.deviceName',
            'Device Metro': '$alert.getTagWords("Metro")',
        },
        duration=600,             # time interval to analyze
        percent_duration=85,      # condition must be true for 85% of the interval
        notification_time=1800,
        streams=['log', 'jira'],  # log and open Jira ticket
    )

This kind of scripting capability enables users to move far beyond static thresholds when designing an alerting system and incorporate complex interdependencies between devices and interfaces and between physical and logical connections. A key benefit of this approach is a dramatic reduction or elimination of false positive and redundant alerts.
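
One way to read the `duration` and `percent_duration` parameters in the sample alert is sketched below in plain Python: the condition is evaluated for every sample in the most recent interval, and the alert fires only if it holds for at least the given percentage of them. This is an illustration of the idea, not the NetSpyGlass implementation. The function name and data are hypothetical, and the condition here takes only the value, whereas the sample script's lambda also receives the variable.

```python
def condition_holds(samples, condition, now, duration, percent_duration):
    """Return True if `condition` is true for at least `percent_duration`
    percent of the (timestamp_sec, value) samples in the last `duration` seconds."""
    window = [v for t, v in samples if now - duration < t <= now]
    if not window:
        return False  # no data in the interval, so nothing to alert on
    matching = sum(1 for v in window if condition(v))
    return 100.0 * matching / len(window) >= percent_duration


# Ten hypothetical utilization samples, one per minute, ending at t=600.
# One sample (0.88) dips below the 0.9 threshold.
utilization = [0.95, 0.93, 0.97, 0.91, 0.88, 0.96, 0.94, 0.92, 0.99, 0.95]
samples = [(60 * i, u) for i, u in enumerate(utilization, start=1)]

fires = condition_holds(samples, lambda v: v > 0.9,
                        now=600, duration=600, percent_duration=85)
```

Requiring the condition to hold for most of the interval, rather than for a single sample, is what lets a brief dip or spike pass without opening or clearing an alert.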

System Configuration File


NetSpyGlass uses a plain text file to configure and manage all functions the platform is capable of performing. This configuration file has a hierarchical structure and consists of individual variables, dictionaries and lists. The syntax is loosely based on JSON, with some notable differences such as support for comments. An example is shown below:


# The default configuration file defines the following polling configurations:

polling = {

    v1public : {
        protocol : snmp,
        version : 1,
        community : public
    },

    v2public : {
        protocol : snmp,
        version : 2,
        community : public
    },

    v3md5public : {
        protocol : snmp,
        version : 3,
        secLevel : authNoPriv,
        auth : MD5,
        user : snmpuser,
        password : public
    },

}

Complete syntax rules are defined in the Configuration File Syntax section of the online documentation.

When NetSpyGlass is installed from an rpm or deb package, the package installs a fully commented prototype system configuration file in which user-defined parameters are placed. The NetSpyGlass server initializes with a “default” configuration loaded from a built-in system configuration file, which is then merged with the prototype configuration file to completely configure the system. This “default” configuration file contains all recognized configuration parameters and is also fully commented to make configuring the system as intuitive as possible.

When the NetSpyGlass system polls any of the prototype configuration files, parameters defined in these files override the corresponding parameters from the built-in default configuration file. Significantly, configuration changes take effect immediately, without a server reboot.

The benefit of this approach is that it provides an extremely scalable method for configuring and managing the entire NetSpyGlass platform configuration through either manual or programmatic means with zero downtime for configuration changes.
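
The override behavior described above can be sketched as a recursive merge of two nested dictionaries, with user-defined values taking precedence over built-in defaults. This is a minimal illustration using plain Python dictionaries, not the NetSpyGlass loader. The function name and the `example-community` value are hypothetical, and whether the real merge works key by key or replaces whole dictionaries is not stated above, so per-key merging is an assumption.

```python
import copy


def merge_config(defaults, overrides):
    """Return a new dict in which values from `overrides` replace the
    corresponding values in `defaults`, descending into nested dicts."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Built-in defaults, modeled on the polling example above
defaults = {
    'polling': {
        'v2public': {'protocol': 'snmp', 'version': 2, 'community': 'public'},
    },
}
# Hypothetical user-defined override of a single parameter
overrides = {'polling': {'v2public': {'community': 'example-community'}}}

effective = merge_config(defaults, overrides)
```

With per-key merging, a user file only needs to list the parameters that differ from the defaults; everything else is inherited unchanged.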

NetSpyGlass Server Query Language


NetSpyGlass incorporates a proprietary query language, NsgQL, whose syntax is loosely based on SQL. NsgQL queries select monitoring variables, devices and components and give access to the associated monitoring data. Implementing an SQL-like query language requires that NetSpyGlass objects and data be mapped to SQL “tables” and “columns” as follows:

  • monitoring variable names and the words “tags”, “devices” and “maps” can be used in place of a “table” after FROM. For example:

SELECT column FROM ifInRate
SELECT column FROM ifOperStatus
SELECT column FROM cpuUtil
SELECT column FROM tags
SELECT column FROM devices
SELECT name FROM maps

  • any variable name, any tag facet name, as well as the words “device”, “component”, “interface”, “address”, “BoxDescr”, “time”, “metric”, “tag”, “tagFacet” and “ViewId” can be used in place of a “column” and in the matching expression after WHERE. For example:

SELECT device,component,metric FROM ifInRate WHERE device=dfw3-dr01-re0 AND ifDescription=MET

General syntax of the SELECT query is as follows:


SELECT column FROM variableName WHERE matchingClause ORDER BY column [DESC]
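
When queries are generated by scripts, the general form above can be assembled from its parts. The helper below is a hypothetical convenience function, not part of NetSpyGlass. It only builds the query text in the shape shown in this section and makes no assumption about how the query is submitted or how values must be quoted.

```python
def build_select(columns, variable, where=None, order_by=None, descending=False):
    """Build an NsgQL-style SELECT string following the general form
    SELECT column FROM variableName WHERE matchingClause ORDER BY column [DESC]."""
    query = 'SELECT {} FROM {}'.format(','.join(columns), variable)
    if where:
        # Join column=value pairs with AND, in the order they were given
        clause = ' AND '.join('{}={}'.format(k, v) for k, v in where.items())
        query += ' WHERE ' + clause
    if order_by:
        query += ' ORDER BY ' + order_by + (' DESC' if descending else '')
    return query


# Reproduces the WHERE example shown earlier in this section
query = build_select(
    ['device', 'component', 'metric'], 'ifInRate',
    where={'device': 'dfw3-dr01-re0', 'ifDescription': 'MET'})
```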

NsgQL is a work in progress but the above examples illustrate the overall approach to providing NetSpyGlass users with a powerful yet familiar query capability. The NetSpyGlass Server Query Language section of the online documentation provides a more complete introduction, including a section on Using NsgQL queries in Python scripts.

Programmability Conquers Complexity


NetSpyGlass was built by network operators for network operators who use the platform in production networks and who work very closely with the NetSpyGlass development team on driving the platform feature set. With NetSpyGlass, network operations teams now have a consolidated toolset with which to reduce costs associated with network downtime while increasing performance through granular visibility, more effective network operations workflows, analytical insights and planning support.

An embedded Python interpreter, a programmatically accessible system configuration and an integrated query language collectively define NetSpyGlass as a fully programmable automation platform for network monitoring with extensive customization capabilities. NetSpyGlass is unique in its ability to enable network operators to conquer complexity at scale with programmatic precision.