Common Troubleshooting
This section defines common troubleshooting and describes its process.
Definition
Common troubleshooting is the troubleshooting of faults that do not adversely affect storage system performance or host services. Such faults are classified as minor.
Process
Figure 2-1 shows the troubleshooting flowchart.
Table 2-4 describes operations involved in common troubleshooting.
Operation | Description |
---|---|
Collect fault information. | If a fault occurs, collect the information required for troubleshooting so that the fault can be located and rectified quickly. For details about the information to collect, see Collecting Fault Information. |
Log in to OceanStor DeviceManager. | Log in to OceanStor DeviceManager to check the operating status of the storage system and whether an alarm has been generated. |
Handle the event by taking the recommended action. | If OceanStor DeviceManager displays event information, handle the event. For details, see the recommended action in OceanStor DeviceManager or Event Reference. |
Locate the fault cause. | Narrow the possible causes down to the exact cause of the fault by analysis, comparison, and other methods. For details on common fault locating methods, see Troubleshooting Methods. |
Contact Huawei technical support. | If you cannot rectify the fault, collect fault information and contact Huawei technical support. |
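
The operations in Table 2-4 can be read as a simple decision sequence. The following Python sketch is illustrative only: it assumes the operations are attempted in the order the table lists them, and the `FaultState` class, its fields, and the `next_operation` function are hypothetical names introduced here. They are not part of OceanStor DeviceManager or any Huawei tool.

```python
from dataclasses import dataclass


@dataclass
class FaultState:
    """Hypothetical snapshot of how far an operator has progressed."""
    info_collected: bool = False
    logged_in: bool = False
    event_found: bool = False      # DeviceManager shows an event or alarm
    event_handled: bool = False    # the recommended action has been taken
    cause_located: bool = False
    fault_rectified: bool = False


def next_operation(state: FaultState) -> str:
    """Return the next Table 2-4 operation for the given state.

    Assumption: operations run in table order, and technical support
    is contacted only after the other operations fail to fix the fault.
    """
    if state.fault_rectified:
        return "Done."
    if not state.info_collected:
        return "Collect fault information."
    if not state.logged_in:
        return "Log in to OceanStor DeviceManager."
    if state.event_found and not state.event_handled:
        return "Handle the event by taking the recommended action."
    if not state.cause_located:
        return "Locate the fault cause."
    return "Contact Huawei technical support."
```

For example, `next_operation(FaultState(info_collected=True, logged_in=True, event_found=True))` returns the "Handle the event by taking the recommended action." operation, because an event is displayed but has not yet been handled.
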
Figure 2-2 shows the troubleshooting flowchart for environment faults.
Table 2-5 describes operations involved in environment fault troubleshooting.
Operation | Description |
---|---|
Collecting host logs | Collect both application logs and operating system logs from the host. |
Checking link status | System faults can occur if network links are down. If a system fault occurs, check whether cables are correctly connected and whether the indicators on the connected ports are normal. |
Collecting switch logs | Use the collected switch information to check switch status and packet loss on ports, and then rectify faults accordingly. |
Collecting storage system fault information | Alarms are generated if the software or hardware of a storage system is not working correctly. Rectify faults by taking the recommended actions in the alarms. |
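
Host log collection can be scripted so that application logs and operating system logs are bundled into a single archive for analysis or for sending to technical support. The following is a minimal Python sketch that uses only the standard library. The function name `collect_host_logs` and any directory paths passed to it are assumptions made for illustration. The actual log locations depend on the host operating system and the applications in use.

```python
import tarfile
from pathlib import Path


def collect_host_logs(log_dirs, archive_path):
    """Bundle every readable file under log_dirs into one .tar.gz archive.

    Returns the list of archive member names that were added.
    """
    added = []
    with tarfile.open(str(archive_path), "w:gz") as archive:
        for log_dir in map(Path, log_dirs):
            if not log_dir.is_dir():
                continue  # skip directories that do not exist on this host
            for path in sorted(log_dir.rglob("*")):
                if not path.is_file():
                    continue
                # Prefix each member with its source directory name so that
                # application logs and OS logs stay separate in the archive.
                name = (Path(log_dir.name) / path.relative_to(log_dir)).as_posix()
                try:
                    archive.add(str(path), arcname=name)
                except OSError:
                    continue  # unreadable file (for example, permission denied)
                added.append(name)
    return added
```

On a Linux host, a call might look like `collect_host_logs(["/var/log", "/opt/myapp/logs"], "host_logs.tar.gz")`, where `/opt/myapp/logs` is a placeholder for the application's log directory. Write the archive to a location outside the directories being collected.
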