Basic Principles
Basic fault locating principles help you filter out irrelevant information and locate faults.
During troubleshooting, observe the following principles:
- Analyze external factors before internal factors.
- External factors include faults in optical fibers, optical cables, power supplies, and customers' devices.
- Internal factors include faults in disks, controllers, and interface modules.
- Analyze higher-severity alarms before lower-severity alarms. The alarm severities, from high to low, are critical, major, and warning.
- Analyze common alarms before uncommon alarms. When analyzing an event, first confirm whether it is a common or an uncommon fault, and then determine its impact, including whether the fault affects a single component or multiple components.
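The ordering principles above can be sketched as a sort key. This is only an illustration, not part of any product tool: the Alarm record, its fields, and the use of an occurrence count as a stand-in for "common" are all hypothetical. The precedence among the three principles (external first, then severity, then commonness) is an assumption that follows the order in which they are listed.

```python
from dataclasses import dataclass

# Severity order taken from the text: critical, then major, then warning.
SEVERITY_RANK = {"critical": 0, "major": 1, "warning": 2}

# External fault sources named in the text; the string labels are hypothetical.
EXTERNAL_SOURCES = {"optical fiber", "optical cable", "power supply", "customer device"}


@dataclass
class Alarm:
    """Hypothetical alarm record used only for this illustration."""
    name: str
    severity: str      # "critical", "major", or "warning"
    source: str        # component or factor that raised the alarm
    occurrences: int   # assumed proxy for how common the alarm is


def triage_order(alarms):
    """Return alarms in the order they would be analyzed under the assumed precedence."""
    def key(alarm):
        return (
            0 if alarm.source in EXTERNAL_SOURCES else 1,  # external factors first
            SEVERITY_RANK[alarm.severity],                 # higher severity first
            -alarm.occurrences,                            # more common first
        )
    return sorted(alarms, key=key)


if __name__ == "__main__":
    sample = [
        Alarm("disk failed", "critical", "disk", 1),
        Alarm("link down", "major", "optical fiber", 5),
        Alarm("power input abnormal", "critical", "power supply", 2),
        Alarm("controller temperature high", "warning", "controller", 9),
    ]
    for alarm in triage_order(sample):
        print(alarm.severity, alarm.source, alarm.name)
```

With the sample data, the two external alarms come first (the power supply alarm ahead of the fiber alarm because it is critical), followed by the internal disk and controller alarms. A site that treats severity as the primary criterion would only need to reorder the tuple returned by the key function.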
To improve emergency handling efficiency and reduce losses caused by faults, follow these principles during emergency handling:
- If a fault that may cause data loss occurs, stop host services or switch services to the standby host, and back up the service data promptly.
- During emergency handling, record every operation performed.
- Emergency handling personnel must complete dedicated training courses and understand the related technologies.
- Recover core services before recovering other services.