FusionInsight HD V100R002C70SPC203 is deployed on the live network, and the
management plane is isolated from the service plane. The operating system is
installed on the H3C RG4900_G2 servers first, followed by the Huawei
FusionInsight HD software. After the installation, the platform runs properly,
but a large number of slow disk alarms are reported. Each of the 30 servers
reports this alarm, and the affected disk slots follow no pattern. In addition,
the number of slow disks increases as the volume of imported data increases.
FusionInsight reports a slow disk alarm.
According to the alarm statistics, all servers report alarms and the affected
slots are random. As the service volume increases, the number of slow disks
increases. This pattern indicates that the problem is not caused by hardware
faults, which would typically be confined to specific disks or servers.
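The reasoning above can be sketched as a small tally over alarm records. This
is only an illustration: the (host, slot) record format, the function name
alarm_spread, and the sample values are hypothetical and do not come from
FusionInsight. Alarms that cover every host and are scattered across many
slots point away from individual disk failures.

```python
from collections import Counter


def alarm_spread(alarms, total_hosts):
    """Summarize how slow disk alarms are spread across hosts and slots.

    alarms: iterable of (host, slot) pairs (hypothetical record format).
    total_hosts: number of servers in the cluster.
    """
    alarms = list(alarms)
    hosts = Counter(host for host, _slot in alarms)
    slots = Counter(slot for _host, slot in alarms)
    return {
        "hosts_affected": len(hosts),
        "host_coverage": len(hosts) / total_hosts if total_hosts else 0.0,
        "distinct_slots": len(slots),
        "max_alarms_on_one_host": max(hosts.values(), default=0),
    }


# Made-up sample: three hosts, alarms on four different slots.
sample = [("node01", 3), ("node02", 7), ("node03", 1), ("node01", 9)]
summary = alarm_spread(sample, total_hosts=3)
```

A host coverage close to 1.0 together with many distinct slots matches the
situation described in this case.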
The live network has both RAID 1 disks and non-RAID disks, and slow disks occur
on both types.
With hardware faults excluded, it is preliminarily determined that the RAID
controller card is behaving abnormally. Check the disk statistics of the
operating system. Run the iostat -xm -n 1 command. The output shows a large
difference between the await and svctm values, and the %util value is 100%.
It is therefore preliminarily determined that the RAID controller card driver
does not match the operating system. After the RAID driver is updated to the
latest version, the slow disk alarms on FusionInsight are cleared and services
run normally.
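The iostat check in the handling process can be sketched as a small parser
that flags devices whose await is far larger than svctm while %util is close
to 100%. This is a minimal sketch under stated assumptions: the function name
find_slow_devices and both thresholds are hypothetical, the sample output is
made up for illustration, and the parser assumes an extended iostat report
that still contains the await and svctm columns (newer sysstat releases no
longer print them).

```python
def find_slow_devices(iostat_text, min_ratio=10.0, min_util=99.0):
    """Return (device, await, svctm, %util) tuples that look saturated.

    min_ratio and min_util are illustrative thresholds, not vendor values.
    """
    slow = []
    columns = None
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            columns = None  # a blank line ends the current device table
            continue
        if fields[0].rstrip(":") == "Device":
            columns = [name.rstrip(":") for name in fields]
            continue
        if columns is None or len(fields) != len(columns):
            continue  # not a device row (banner, avg-cpu section, ...)
        row = dict(zip(columns, fields))
        try:
            await_ms = float(row["await"])
            svctm_ms = float(row["svctm"])
            util = float(row["%util"])
        except (KeyError, ValueError):
            continue  # required columns missing or not numeric
        # Requests wait much longer in the queue than the device needs to
        # serve them, and the device is busy almost all of the time.
        if util >= min_util and await_ms >= min_ratio * max(svctm_ms, 0.01):
            slow.append((row["Device"], await_ms, svctm_ms, util))
    return slow


# Made-up sample in the layout of an extended iostat report.
SAMPLE = """\
Device:  rrqm/s wrqm/s    r/s    w/s  rMB/s  wMB/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0.00   2.00   1.00   3.00   0.01   0.02    15.00     0.01   1.50   0.80   0.32
sdb        0.00  10.00   0.00 120.00   0.00  30.00   512.00   140.00 900.00   8.30 100.00
"""

suspects = find_slow_devices(SAMPLE)
```

In the sample, only sdb is flagged: its await is roughly 100 times its svctm
and its %util is 100%, which is the pattern described in this case.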
The RAID controller card driver is faulty.
It is recommended that the hardware driver be upgraded to the latest version.
When a large number of hardware alarms are reported at the same time, the cause
is usually an incompatibility issue rather than a hardware fault.
Big data is decoupled from hardware. You are advised to upgrade the hardware
driver to the latest version during deployment.