The System Displays "vos_queue_send fail(errno=11)" Under Heavy Load When SOL Is Used

Publication Date: 2015-06-19
Issue Description
Hardware configuration:
T6000 server

Software configuration:

ipmitool software tool

Symptom:

After a serial over LAN (SOL) session is established with ipmitool, the baseboard management controller (BMC) serial port occasionally outputs the following message when the SOL data load is extremely high and few BMC CPU resources are available:

vos_queue_send fail(errno=11)
vos_queue_send fail(errno=11)
vos_queue_send fail(errno=11)
vos_queue_send fail(errno=11)
vos_queue_send fail(errno=11)
vos_queue_send fail(errno=11)
vos_queue_send fail(errno=11)
Handling Process
1.  Before processing data, the SOL module of the BMC encrypts the data.

2.  A heavy SOL data flow consumes a large share of the BMC's computing resources.
  • Extremely heavy data flow: for example, continuously listing the contents of a directory or file in the operating system (OS) over SOL.
  • High resource consumption: for example, at a baud rate of 115200, the BMC uses more than about 70% of its CPU resources.
3.  SOL data is lost under extreme load.

Assume that the following two conditions (extreme load) occur at the same time:
  • At a baud rate of 115200, an operator continuously lists the contents of a directory or file in the host OS over SOL.
  • An operator runs ipmitool sensor list in the host OS to query the sensors. This uses the keyboard controller style (KCS) interface channel, and processing the KCS data consumes a large share of BMC CPU resources.
If both types of load are present at the same time, "vos_queue_send fail(errno=11)" is displayed because the BMC does not have sufficient CPU resources to process SOL data.

4.  When the BMC cannot process SOL data in time, some SOL data is lost in the BMC. In this case, "vos_queue_send fail(errno=11)" is output from the BMC serial port, indicating that the queue is full and that some serial port data has been lost.
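The queue-full behavior described in the steps above can be illustrated with a small model. This is a minimal sketch, not the BMC firmware: the function name simulate, the queue capacity, and the per-tick rates are all hypothetical. On Linux, errno 11 corresponds to EAGAIN ("resource temporarily unavailable"); that the BMC firmware uses the same numbering is an assumption.

```python
import queue

EAGAIN = 11  # Linux errno value; assumed to match the BMC's errno=11


def simulate(produced_per_tick, consumed_per_tick, ticks, capacity=64):
    """Return (delivered, dropped) byte counts for a bounded, non-blocking queue."""
    q = queue.Queue(maxsize=capacity)
    delivered = dropped = 0
    for _ in range(ticks):
        # Producer: the host serial port pushes bytes without blocking.
        for _ in range(produced_per_tick):
            try:
                q.put_nowait(0)
            except queue.Full:
                # Models "vos_queue_send fail(errno=11)": the byte is lost.
                dropped += 1
        # Consumer: the BMC drains only as many bytes as its CPU budget allows.
        for _ in range(consumed_per_tick):
            try:
                q.get_nowait()
                delivered += 1
            except queue.Empty:
                break
    return delivered, dropped
```

When the consumer keeps pace, for example simulate(10, 10, 100), nothing is dropped. When the consumer is slower, for example simulate(10, 5, 100), the queue fills after a few ticks and every later tick loses data, which matches the repeated message at the BMC serial port.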
Root Cause
In certain extreme scenarios, the BMC processor cannot keep up with the load.



Solution
Reduce the system baud rate so that the host generates less serial port data and the BMC CPU has time to process it. Reduce the baud rate from 115200 to 57600, 38400, 19200, or 9600, depending on the load.
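To see how much each baud rate reduces the peak serial data rate, the arithmetic can be sketched as follows. The framing of 10 bits per byte (8N1: one start bit, eight data bits, one stop bit) is an assumption, and the function names are illustrative only.

```python
BITS_PER_BYTE_ON_WIRE = 10  # assumption: 8N1 framing (1 start + 8 data + 1 stop)

BAUD_RATES = [115200, 57600, 38400, 19200, 9600]


def max_bytes_per_second(baud):
    """Upper bound on serial payload bytes per second at the given baud rate."""
    return baud // BITS_PER_BYTE_ON_WIRE


def relative_load(baud, reference=115200):
    """Peak data rate as a fraction of the rate at the reference baud rate."""
    return baud / reference


for baud in BAUD_RATES:
    print(f"{baud:>6} baud: up to {max_bytes_per_second(baud):>5} bytes/s "
          f"({relative_load(baud):.0%} of the 115200 rate)")
```

Under this assumption, halving the baud rate from 115200 to 57600 halves the peak number of bytes per second that the BMC must process.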

1.  At a baud rate of 115200:
  • Continuously list the contents of a directory or file in the host OS over SOL.
  • Run one ipmitool sensor list cycle in the host OS.
The preceding message is displayed.

2.  At a baud rate of 57600:
  • Continuously list the contents of a directory or file in the host OS over SOL.
  • Run two ipmitool sensor list cycles in the host OS.
The preceding message is not displayed.
  • Continuously list the contents of a directory or file in the host OS over SOL.
  • Run four ipmitool sensor list cycles in the host OS.
The preceding message is displayed.

Other applications also affect the BMC's processing capacity. For example, copying large files through the virtual CD-ROM drive consumes a large share of BMC CPU resources, and the preceding message is displayed.

Take the Red system as an example. To set the baud rate to 9600, modify the following two files:

/etc/grub.conf



/etc/inittab
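The contents of these two files are not reproduced here. As a rough illustration only, the sketch below rewrites the baud rate in two hypothetical sample lines: a kernel line with a console=ttyS0,<baud> argument and a getty entry for ttyS0. The sample lines, the serial device name, and the helper names are assumptions; the actual entries on a given system may differ.

```python
import re

# Hypothetical sample lines; the real files on a given system may differ.
SAMPLE_GRUB_LINE = "kernel /vmlinuz ro root=/dev/sda1 console=tty0 console=ttyS0,115200n8"
SAMPLE_INITTAB_LINE = "S0:2345:respawn:/sbin/agetty ttyS0 115200 vt100"


def set_console_baud(line, baud):
    """Rewrite the baud rate in a console=ttyS<N>,<baud> kernel argument."""
    return re.sub(r"(console=ttyS\d+,)\d+", rf"\g<1>{baud}", line)


def set_getty_baud(line, baud):
    """Rewrite the baud rate that follows the ttyS<N> argument of a getty entry."""
    return re.sub(r"(\bttyS\d+\s+)\d+", rf"\g<1>{baud}", line)


print(set_console_baud(SAMPLE_GRUB_LINE, 9600))
print(set_getty_baud(SAMPLE_INITTAB_LINE, 9600))
```

Both settings would need to use the same baud rate, and the SOL client side would have to match it as well.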


Suggestions
  1. SOL data loss cannot be completely avoided; it is determined by the processing capability of the current BMC system.
  2. When data is lost, the preceding message is displayed to inform users that some SOL data has been lost in the BMC.
  3. The message is only a prompt and is displayed only at the serial port. It has no impact on data use.
