CX11x, CX31x, CX710 (Earlier Than V6.03), and CX91x Series Switch Modules V100R001C10 Configuration Guide 12

This document describes how to configure the services supported by the CX11x, CX31x, and CX91x series switch modules. It covers configuration examples and function configurations.
Principles

This section describes the implementation principle of fault management.

Concepts

  • Alarm:

    An alarm is a notification that a device generates to inform users of a fault. Maintenance personnel use alarms to learn the device running status and to locate faults.

  • Active alarm:

    An active alarm is the notification sent when an alarm is generated.

  • Clear alarm:

    A clear alarm is the notification sent when an alarm is cleared.

    NOTE:

    Each active alarm maps to a clear alarm.

  • Root-cause alarm:

    A root-cause alarm is an alarm that causes other alarms. For example, if an interface fault makes a route unreachable, the alarm generated for the interface fault is a root-cause alarm.

  • Non-root-cause alarm:

    A non-root-cause alarm is an alarm caused by a root-cause alarm. For example, if an interface fault makes a route unreachable, the alarm generated for the unreachable route is a non-root-cause alarm.

  • Intermittent alarm:

    If the interval between the generation time and the clearance time of an alarm is shorter than a specified value (the intermittent threshold, which is specified based on the product and alarm), the alarm is called an intermittent alarm. An intermittent alarm lasts only a short period from generation to clearance.

  • Flapping alarm:

    If the number of times that an alarm is generated on an object within a specified period exceeds the flapping threshold (which is specified based on the product and alarm), the generated alarms are called flapping alarms.
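The two threshold definitions above can be sketched in Python as follows. This is only an illustration: the function names, the use of `datetime` values, and the sliding-window reading of "a specified period" are assumptions, not the device's actual implementation.

```python
from datetime import datetime, timedelta
from typing import Sequence


def is_intermittent(generated_at: datetime, cleared_at: datetime,
                    intermittent_threshold: timedelta) -> bool:
    """True if the alarm was cleared sooner than the intermittent threshold."""
    return cleared_at - generated_at < intermittent_threshold


def is_flapping(generation_times: Sequence[datetime], period: timedelta,
                flapping_threshold: int) -> bool:
    """True if more than flapping_threshold generations fall in any window of length period.

    Assumption: the "specified period" is treated as a sliding window.
    """
    times = sorted(generation_times)
    start = 0
    for end in range(len(times)):
        # Advance the window start until the window spans at most `period`.
        while times[end] - times[start] > period:
            start += 1
        if end - start + 1 > flapping_threshold:
            return True
    return False
```

For example, five generations of the same alarm spaced 10 seconds apart would count as flapping under a 60-second period and a threshold of 3, while the same five generations spaced 60 seconds apart would not.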

Principles

When the fault management module receives alarms generated on the device, it stores them based on their default severities and records their generation time.

After the fault management function is configured, you can:
  • Change the alarm severities on the device and configure filtering rules on the NMS to filter out unnecessary alarms.
  • Enable the alarm reporting delay function to prevent alarms from being reported repeatedly. When the alarm or event reporting delay period expires, only the last alarm is reported.
  • Enable alarm correlation to identify root-cause and non-root-cause alarms. Non-root-cause alarms are filtered out and the system reports only root-cause alarms to the NMS, which makes faults quicker to locate.
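The reporting delay behavior described in the list above (only the last alarm is reported when the delay period expires) can be sketched as a small buffer. The class name `ReportingDelay`, the alarm key format, and the choice to start the delay at the first occurrence are all assumptions made for illustration.

```python
class ReportingDelay:
    """Hold alarms for a delay period and report only the latest one per key."""

    def __init__(self, delay_seconds: float) -> None:
        self.delay_seconds = delay_seconds
        # key -> (time the first alarm was seen, most recent alarm for that key)
        self._pending = {}

    def receive(self, key: str, alarm: dict, now: float) -> None:
        # Keep the original start time, but overwrite the stored alarm
        # so that only the last one survives.
        first_seen, _ = self._pending.get(key, (now, None))
        self._pending[key] = (first_seen, alarm)

    def flush(self, now: float) -> list:
        """Return the last alarm of every key whose delay period has expired."""
        expired = [key for key, (first_seen, _) in self._pending.items()
                   if now - first_seen >= self.delay_seconds]
        return [self._pending.pop(key)[1] for key in expired]
```

With a 5-second delay, two alarms received for the same key at 0 s and 2 s produce nothing from `flush(3.0)`, and a single report (the second alarm) from `flush(5.0)`.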

Alarm Severity

As defined in X.733, alarms are classified into four severities, as described in Table 3-15. A smaller value indicates a higher severity.

Table 3-15 Descriptions of alarm severities

Value | Severity | Description
1 | Critical | Indicates that a fault affecting services has occurred and must be rectified immediately.
2 | Major | Indicates that services are being affected and that corrective measures need to be taken urgently.
3 | Minor | Indicates that a fault has occurred but does not affect services. To prevent a minor alarm from becoming more severe, corrective measures must be taken immediately.
4 | Warning | Indicates that a potential or impending service-affecting fault has been detected before any significant effects have been felt. Take corrective actions to diagnose and rectify the fault.
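Because a smaller value means a higher severity, sorting alarms by severity value in ascending order puts the most urgent ones first. The sketch below models the four severities as a Python `IntEnum`. The enum, the helper function, and the alarm names in the usage note are hypothetical.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Alarm severities from Table 3-15; a smaller value is more severe."""
    CRITICAL = 1
    MAJOR = 2
    MINOR = 3
    WARNING = 4


def most_severe_first(alarms):
    """Sort (name, severity) pairs so the most severe alarms come first."""
    return sorted(alarms, key=lambda alarm: alarm[1])
```

For instance, `most_severe_first([("fanFail", Severity.MINOR), ("linkDown", Severity.CRITICAL)])` places `linkDown` ahead of `fanFail`.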

Alarm Correlation

When a network element (NE) on the network is faulty, the system reports the faults to the NMS. A single fault often triggers multiple alarms, which makes faults harder to locate. Alarms triggered by the same fault are associated with each other. The alarm correlation function analyzes the associations among the alarms generated in the system and determines which are root-cause alarms and which are non-root-cause alarms. After alarm correlation suppression is configured, the system reports only root-cause alarms to the NMS. This reduces the number of reported non-root-cause alarms, reduces the network load, and helps locate faults quickly.

Alarm correlation includes alarm correlation analysis and alarm correlation suppression.
  • Alarm correlation analysis

    Alarm correlation analysis is performed based on the alarm definition, which includes the fault time window, the fault rectification time window, the root-cause alarm, and the method of association with the root-cause alarm. After receiving an alarm, the fault management module analyzes the alarm correlation within the duration specified by the fault time window, and then sends the alarm with the analysis result to the NMS.

    Figure 3-26 shows the alarm correlation analysis flow.
    Figure 3-26 Alarm correlation analysis flow

    The status of an alarm can be:

    • Active & Independent: An active root-cause alarm is in the period specified by the fault time window.
    • Active & Dependent: An active non-root-cause alarm is in the period specified by the fault time window.
    • Persistent Event: An alarm enters the active alarm queue.
    • Filtered Out: An alarm and its clear alarm are both deleted from the active alarm queue.

    The process of alarm correlation analysis is as follows:

    1. If an alarm is considered a root-cause alarm, it is suppressed for the period specified by its fault time window.
    2. If an alarm is considered a non-root-cause alarm, it is suppressed for the period specified by the fault time window of its parent alarm.
    3. When the duration specified by the fault time window of the root-cause alarm expires, the system sends both the root-cause and non-root-cause alarms.
    4. If the clear alarm of a non-root-cause alarm is generated within the duration specified by the fault time window, the system deletes both the alarm and its clear alarm.
    5. If the clear alarm of a root-cause alarm is generated within the duration specified by the fault time window, the system deletes both the alarm and its clear alarm.
    6. If the root-cause alarm of an alarm is not generated within the duration specified by the fault time window, the system considers this alarm a root-cause alarm.
    7. If the root-cause alarm of an alarm is generated within the duration specified by the fault time window, the system considers this alarm a non-root-cause alarm.
  • Alarm correlation suppression

    When alarm correlation analysis is complete, an alarm carries an identifier indicating a root-cause alarm or a non-root-cause alarm. Before sending an alarm to the SNMP agent, the system checks whether NMS-based alarm suppression has been configured for non-root-cause alarms.

    • If so, the system filters out non-root-cause alarms and sends only root-cause alarms to the NMS.
    • If not, the system sends both root-cause alarms and non-root-cause alarms to the NMS.
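The analysis and suppression steps above can be sketched together in Python. This is a simplified, offline illustration under several assumptions: one alarm per name, a single fault time window shared by all alarms, timestamps in seconds, and a `parent` field standing in for the root-cause alarm named in the alarm definition. None of the names below come from the device software.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Alarm:
    name: str
    raised_at: float                      # generation time, in seconds
    cleared_at: Optional[float] = None    # time of the clear alarm, if any
    parent: Optional[str] = None          # root-cause alarm from the alarm definition


def correlate(alarms: List[Alarm], window: float) -> List[Tuple[str, bool]]:
    """Return (name, is_root_cause) for every alarm that outlives its fault time window."""
    by_name: Dict[str, Alarm] = {alarm.name: alarm for alarm in alarms}
    results = []
    for alarm in alarms:
        # Steps 4 and 5: an alarm cleared inside the window is deleted
        # together with its clear alarm (status "Filtered Out").
        if alarm.cleared_at is not None and alarm.cleared_at - alarm.raised_at <= window:
            continue
        parent = by_name.get(alarm.parent) if alarm.parent else None
        # Steps 6 and 7: the alarm is a non-root-cause alarm only if its
        # root-cause alarm was generated within the fault time window.
        dependent = parent is not None and abs(parent.raised_at - alarm.raised_at) <= window
        results.append((alarm.name, not dependent))
    return results


def to_nms(results: List[Tuple[str, bool]], suppress_non_root: bool) -> List[str]:
    """Apply correlation suppression: drop non-root-cause alarms if configured."""
    return [name for name, is_root in results if is_root or not suppress_non_root]
```

As a usage example with a 10-second window, an interface alarm at 0 s and a dependent route alarm at 1 s yield one root-cause and one non-root-cause alarm, and only the interface alarm reaches the NMS when suppression is enabled. A dependent alarm raised at 100 s, long after its parent, is treated as a root-cause alarm in its own right.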
Updated: 2019-08-09

Document ID: EDOC1000041694
