The storage array was connected to the application server through iSCSI connections. The Asianux host used IOmeter to perform read/write testing on the LUN mapped from the storage array. During the read/write operations, an iSCSI link-down occurred (caused by a cable removal or unexpected power-off). Host CPU utilization then rose to nearly 100%, and the host stopped responding, even to SSH or KVM login attempts.
Product and version information:
S5500T V100R001 V100R002
S5600T V100R001 V100R002
S5800T V100R001 V100R002
S6800T V100R001 V100R002
S3900 V100R001 V100R002
S5900 V100R001 V100R002
S6900 V100R001 V100R002
Application server operating system: Asianux 3 SP2 for X86_64
Application server native iSCSI initiator version: iscsi-initiator-utils-220.127.116.118-0.18.1AXS3
Dynamo, the IOmeter workload generator, has an infinite retry mechanism on Linux. When a slave block device returned an I/O error to upper-layer applications, Dynamo immediately retried the failed I/O, and the iSCSI driver, with the link still down, returned an error for the retried I/O as well. Each failure triggered another immediate retry, so the process was caught in a logical infinite loop that drove CPU utilization to nearly 100%.
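The difference between an immediate, unbounded retry and a bounded retry with a delay can be sketched as follows. This is an illustrative sketch only, not Dynamo's actual code: `LinkDownDevice` and `read_with_bounded_retry` are hypothetical names, and the device simply simulates a LUN whose iSCSI link is down. An unbounded loop around `device.read()` with no delay would spin on the CPU in the way described above; the version below gives up after a fixed number of attempts and sleeps between them.

```python
import time


class LinkDownDevice:
    """Hypothetical stand-in for a block device whose iSCSI link is down."""

    def __init__(self):
        self.attempts = 0

    def read(self):
        # Every read fails immediately, as it would while the link is down.
        self.attempts += 1
        raise IOError("I/O error: iSCSI link down")


def read_with_bounded_retry(device, max_retries=5, base_delay=0.01,
                            sleep=time.sleep):
    """Retry a failed read a limited number of times with a growing delay."""
    for attempt in range(max_retries):
        try:
            return device.read()
        except IOError:
            # Sleeping yields the CPU; an immediate retry would spin instead.
            sleep(base_delay * (2 ** attempt))
    raise IOError(f"giving up after {max_retries} failed attempts")
```

With a bounded retry, a persistent link-down surfaces as an error to the caller instead of consuming the CPU indefinitely.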
When CPU utilization is close to 100%, the host may not respond to any external events.
We ran the same test on a Red Hat 5.4 host, and the same symptoms occurred.
When using IOmeter on a Linux host that is connected to the storage array through iSCSI connections, avoid iSCSI link-downs during the test; otherwise, the Linux host CPU utilization may approach 100%.
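One way to reduce the risk is to confirm that every iSCSI session is healthy before starting an IOmeter run. The sketch below is an assumption-laden example, not part of the original case: it assumes the Linux iSCSI transport class exposes each session's state under `/sys/class/iscsi_session/session*/state` and that a healthy session reports `LOGGED_IN`. The function name `iscsi_sessions_not_logged_in` is hypothetical. It does not prevent a cable removal or power-off during the test.

```python
from pathlib import Path


def iscsi_sessions_not_logged_in(sysfs_root="/sys/class/iscsi_session"):
    """Return {session_name: state} for sessions not in the LOGGED_IN state.

    Assumes each session directory contains a 'state' file (sysfs layout
    of the Linux iSCSI transport class); returns {} if the root is absent.
    """
    root = Path(sysfs_root)
    unhealthy = {}
    if not root.is_dir():
        return unhealthy
    for state_file in sorted(root.glob("session*/state")):
        state = state_file.read_text().strip()
        if state != "LOGGED_IN":
            unhealthy[state_file.parent.name] = state
    return unhealthy
```

An empty result suggests that all sessions visible to the host were logged in at the time of the check.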