AI Fabric: Visualized Network O&M
Leveraging Telemetry and iMaster NCE-FabricInsight, an intelligent analysis platform, AI Fabric can collect network status data in real time.
Telemetry: Proactive Push Mechanism
Using Telemetry, CloudEngine series switches push high-precision performance metrics to iMaster NCE-FabricInsight, which in this way collects status data for the network devices in AI Fabric.
Telemetry is a network monitoring technology developed to quickly collect performance data from physical or virtual devices remotely. Compared with traditional network monitoring technologies, Telemetry enables network devices to push high-precision performance data to iMaster NCE-FabricInsight at a high rate in real time, improving the utilization of device and network resources during data collection.
As shown in Figure 1-2, Telemetry has the following advantages over the traditional network monitoring technology (SNMP GET):
- Sampled data is proactively pushed, increasing the number of nodes that can be monitored.
In the SNMP query process, the network management system (NMS) and devices interact through requests and responses, that is, in pull mode. If the NMS sends 1000 query requests in the first minute, the device must parse 1000 SNMP query requests. In the second minute, it parses another 1000, and so on in every following minute, even though the requests are the same from one minute to the next. Parsing these requests consumes CPU resources of the device, so the number of monitored nodes must be limited to keep the device running normally.
In the Telemetry process, the NMS and devices interact with each other in push mode. In the first minute, the NMS sends 1000 subscription packets to a device, and the device parses these subscription packets. During the parsing, the device records the subscription information of the NMS. In the following minutes, the NMS does not send subscription packets to the device. Instead, the device automatically and continuously pushes the subscribed data to the NMS based on the recorded subscription information. In this way, there is no need to parse 1000 subscription packets in each subsequent minute. Telemetry conserves the CPU resources of the device and allows more devices to be monitored.
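The difference in parsing load can be sketched with a small model. This is an illustration only, not device behavior measured on any product: the function names are hypothetical, and the model assumes that each SNMP query request and each subscription packet costs one parse operation.

```python
def snmp_parse_count(requests_per_minute: int, minutes: int) -> int:
    """Pull mode: the device parses every query request again in each minute."""
    return requests_per_minute * minutes


def telemetry_parse_count(subscriptions: int, minutes: int) -> int:
    """Push mode: subscription packets are parsed once, in the first minute."""
    return subscriptions if minutes >= 1 else 0


if __name__ == "__main__":
    # 1000 requests or subscriptions, as in the example in the text.
    for minutes in (1, 2, 60):
        print(minutes,
              snmp_parse_count(1000, minutes),
              telemetry_parse_count(1000, minutes))
```

Under these assumptions, the two modes cost the same in the first minute, but after one hour the device has parsed 60000 SNMP query requests compared with 1000 subscription packets.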
- Sampled data is packed and then reported, improving the time precision of data collection.
In the SNMP query process, a device needs to parse a large number of query requests every minute and reports only one piece of sample data for each query request. Parsing query requests consumes CPU resources of the device. To ensure normal running of the device, the frequency at which the NMS sends query requests must be limited. This reduces the time precision of data collection. In most cases, the sampling precision of traditional network monitoring technologies is in seconds.
In Telemetry, a device parses subscription packets only in the first minute. For each subscription, the device packs multiple pieces of sample data into a single report, which reduces the number of packets it exchanges with the NMS. As a result, Telemetry sampling precision can reach the subsecond or even millisecond level.
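The effect of packing can be sketched as follows. The 100 ms sampling interval and the batch size of 10 are illustrative assumptions, not values taken from any product, and pack_samples and reports_per_minute are hypothetical helper names.

```python
from typing import List, Sequence


def pack_samples(samples: Sequence[float], batch_size: int) -> List[List[float]]:
    """Group consecutive samples so that each group is sent as one report."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [list(samples[i:i + batch_size])
            for i in range(0, len(samples), batch_size)]


def reports_per_minute(sample_interval_ms: int, batch_size: int) -> int:
    """Number of reports sent in one minute for a given interval and batch size."""
    samples = [0.0] * (60_000 // sample_interval_ms)  # placeholder sample values
    return len(pack_samples(samples, batch_size))


if __name__ == "__main__":
    # One sample per report compared with ten samples packed per report.
    print(reports_per_minute(100, 1))
    print(reports_per_minute(100, 10))
```

With these assumed numbers, sampling every 100 ms produces 600 samples per minute. Sending them one by one takes 600 reports, while packing 10 samples per report takes 60, so a shorter sampling interval does not have to mean proportionally more packets.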
- Sampled data contains timestamps, improving the accuracy of sample data.
In traditional network monitoring, sampled data does not contain timestamps. Because network latency varies, the NMS cannot tell exactly when each sample was generated, which makes the network node data on the NMS inaccurate.
In the Telemetry process, sampled data contains timestamps. When parsing the data, the NMS can determine the time at which each sample was generated, minimizing the impact of network transmission latency on the sampled data.
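The role of timestamps can be sketched with a small example. The Sample class, the helper functions, and the latency values below are hypothetical and chosen only to illustrate the point. Samples are generated every 100 ms, but each report experiences a different network latency on its way to the NMS.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Sample:
    generated_ms: int  # timestamp set by the device when the value was sampled
    value: float


def arrival_times(samples: Sequence[Sample], latencies_ms: Sequence[int]) -> List[int]:
    """Time at which each sample reaches the NMS, given per-report latency."""
    return [s.generated_ms + d for s, d in zip(samples, latencies_ms)]


def intervals(times_ms: Sequence[int]) -> List[int]:
    """Gaps between consecutive time values."""
    return [b - a for a, b in zip(times_ms, times_ms[1:])]


if __name__ == "__main__":
    samples = [Sample(0, 10.0), Sample(100, 12.0), Sample(200, 11.0)]
    latencies_ms = [5, 40, 12]  # assumed, varying network latency
    print(intervals([s.generated_ms for s in samples]))       # device timestamps
    print(intervals(arrival_times(samples, latencies_ms)))    # NMS arrival times
```

Using the device timestamps, the NMS sees the true 100 ms spacing between samples. If it relied on arrival time instead, the same samples would appear to be 135 ms and 72 ms apart, which is the kind of distortion that timestamps in the sampled data avoid.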