FusionInsight HD V100R002C70SPC203 is used on the live network, and the management plane is isolated from the service plane. Huawei FusionInsight HD is installed on H3C RG4900_G2 servers after the operating system is installed. After the installation is complete, the platform runs properly, but a large number of slow disk alarms are reported. All 30 servers report this alarm, and the affected disk slots follow no pattern. In addition, the number of slow disks grows as the volume of imported data increases.
FusionInsight reports slow disk alarms.
1. The alarm statistics show that all servers report the alarm and that the affected slots are random. The number of slow disks also grows with the service volume. This indicates that the problem is not caused by hardware faults.
2. The live network has both RAID 1 disks and non-RAID disks. Slow disks occur on both types.
3. With hardware faults excluded, it is preliminarily suspected that the RAID controller card is behaving abnormally. Check the disk statistics of the operating system.
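4. Run the iostat -xm -n 1 command. The output shows a large difference between the await and svctm values, and %util is 100%. A large await relative to svctm means that I/O requests spend most of their time waiting in the queue rather than being serviced.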
5. It is preliminarily determined that the RAID controller card driver does not match the operating system. After the RAID driver is updated to the latest version, the slow disk alarms on FusionInsight are cleared and services are normal.
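The check in step 4 can be sketched as a small script. This is a minimal illustration, not part of FusionInsight: the function name find_slow_disks, the sample output, and both thresholds (a 50 ms gap between await and svctm, 95% utilization) are assumptions chosen for the example. It also assumes the extended iostat layout that includes the await, svctm, and %util columns; newer sysstat releases may not provide svctm.

```python
# Hypothetical sample of extended iostat output (values are illustrative only).
SAMPLE = """\
Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     2.00    1.00    5.00     0.01     0.05    20.00     0.02    3.10   1.20   0.70
sdb               0.00   110.00    3.00  240.00     0.02    95.00   800.00   140.00  580.40   4.10 100.00
"""


def find_slow_disks(iostat_text, gap_ms=50.0, util_pct=95.0):
    """Return devices whose (await - svctm) gap and %util exceed the thresholds.

    The thresholds are assumptions for this sketch, not FusionInsight values.
    """
    slow = []
    columns = None
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("Device"):
            # Header row: remember the position of each metric by name.
            columns = {name: index for index, name in enumerate(fields)}
            continue
        # Skip CPU rows and anything that does not line up with the header.
        if columns is None or len(fields) != len(columns):
            continue
        if not {"await", "svctm", "%util"} <= columns.keys():
            continue
        try:
            await_ms = float(fields[columns["await"]])
            svctm_ms = float(fields[columns["svctm"]])
            util = float(fields[columns["%util"]])
        except ValueError:
            continue
        # Long queueing time plus a saturated device suggests a slow disk.
        if await_ms - svctm_ms > gap_ms and util >= util_pct:
            slow.append(fields[0])
    return slow


print(find_slow_disks(SAMPLE))
```

In practice the text passed to the function would be captured from the iostat command on each server. Because slow disks appeared on every server and in random slots in this case, such a check would be expected to flag devices across the whole cluster, which points away from individual disk failures.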
1. The RAID controller card driver is faulty.
2. It is recommended that the hardware driver be upgraded to the latest version during deployment.
1. When a large number of hardware alarms are reported at the same time, the cause is usually a compatibility issue rather than hardware faults.
2. Big data software is decoupled from the hardware it runs on. You are advised to upgrade the hardware drivers to the latest version during deployment.