Huawei Server Maintenance Manual 09

Common Problems of RAID Controller Cards and Hard Disks

Occasional Initialization Failure of the LSI SAS3508 RAID Controller Card When the V5 Server Is Powered On and Off Repeatedly
Problem Description
Table 5-51 Basic information

Source of the Problem: CH121 V5, 2288H V5
Intended Product: V5 servers
Release Date: 2018-05
Keyword: V5, power-on, power-off, LSI SAS3508, RAID controller card, initialization failure

Symptom

During the long-term ORT reliability test of the LSI SAS3508 RAID controller card, initialization failure may occur at a low probability when the server is repeatedly powered on and off using AC power supply (simulating extreme scenarios). When the RAID controller card fails to be initialized, the OS fails to be started.

Trigger conditions:

  1. The control node uses the LSI SAS3508 RAID controller card.
  2. The PCB version of the LSI SAS3508 RAID controller card is .A.
  3. The current write policy of the LSI SAS3508 RAID controller card is Write Back or Write Back with BBU.
  4. The entire chassis is powered on and then powered off, or a compute node is removed and then inserted.

Fault symptom:

The boot device is not found during server startup, and the OS fails to be started.

Identification method:

  1. Obtain the iBMC IP address of the compute node or the 2288H V5 server from the network design document, and log in to the iBMC WebUI. The default user name is Administrator, and the default password is Admin@9000.

  2. Choose Information > System Info > Storage. Check whether the Type of the RAID controller card is LSI SAS3508. If yes, go to the next step; if no, this article is not applicable.

  3. Check whether the PCB version of the RAID controller card is .A. If yes, go to the next step; if no, this article is not applicable.

  4. Check whether Current Write Policy of the RAID controller card is Write Back or Write Back with BBU. If yes, this article is applicable; if no, this article is not applicable.
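If the OS on the node is still accessible, the same checks can be made from the command line. The following is a minimal sketch, assuming the Storcli tool (whose installation is described in the solution below) is available in /opt/MegaRAID/storcli:

    # Controller model (should report SAS3508):
    ./storcli64 /c0 show | grep -i "Product Name"
    # Cache mode of each virtual drive ("WB" in the Cache column indicates Write Back):
    ./storcli64 /c0/vall show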

Key Process and Cause Analysis

Cause:

A consistency problem exists in the chipset of the LSI SAS3508 RAID controller card. When the server is powered on or off repeatedly using AC power supply, a signal metastable state may occur at a low probability (0 or 1 at random). As a result, the RAID software does not enter the power-off protection process, and initialization fails.

In the firmware of the LSI SAS3508 RAID controller card, the default BIOS mode is Stop on error. In this mode, if an error or configuration change occurs in the firmware, the UEFI driver status is set to Not healthy during startup. To log in to the OS, press F11 during server startup and restore the driver status on the Device Manager screen.

Conclusion and Solution
NOTE:

This solution applies only to NFV scenarios and affects performance. In other scenarios, apply this solution only after evaluating the actual services.

Rectification method:

If the fault described in this case occurs, perform the following operations to rectify the fault.

  1. Log in to the Device Manager screen of the LSI SAS3508 RAID controller card.

    1. Log in to the iBMC WebUI, and choose Remote Console > Java Integrated Remote Console (Shared) to access the KVM.

    2. Restart the server on the KVM.

    3. During the startup, press F11 when prompted.

    4. Enter the password (the default password is Admin@9000) and press Enter. On the management screen, choose Device Manager.

  2. On the Device Manager screen, choose Some drivers are not healthy.

  3. On the Driver Health screen, choose Repair the whole platform.

  4. "Memory/battery problems were detected" is displayed.

  5. Press Enter.

  6. Enter c and press Enter twice. If the following screen is displayed, the configuration is complete.

  7. Use the KVM to restart the server.

Solution:

For V5 servers on the live network, the problem may occur in three scenarios.

  1. The OS has been installed on the server and is running properly.
  2. A RAID group has been created on the server, but the OS is not installed.
  3. No RAID group is created on the server.

Scenario 1: The OS has been installed on the server and is running properly.

  1. Obtain MegaRAID Storcli.

    1. Log in to the Broadcom website, and choose DOWNLOADS > Management Software and Tools. The address is as follows:

      https://www.broadcom.com/products/storage/raid-controllers/megaraid-9440-8i#downloads

    2. Download MegaRAID Storcli of the latest version.

    3. Decompress the downloaded tool package, and use FileZilla or WinSCP to upload the rpm tool package from the Linux directory to the first node of FusionSphere OpenStack.

  2. Log in to the head node of FusionSphere OpenStack as the fsp user over SSH. The IP address of the head node is the reverse proxy IP address of FusionSphere OpenStack. The default password is Huawei@CLOUD8. Run the su - root command to switch to the root user. The default password is Huawei@CLOUD8!.
  3. Run the source set_env command to import environment variables.

    For V100R006C10SPCXXX, the command output is as follows:

    please choose environment variable which you want to import:

    1. openstack environment variable (keystone v3)
    2. cps environment variable
    3. openstack environment variable legacy (keystone v

    please choose:[1|2|3]

    Enter 1 and press Enter. Enter the password of OS_USERNAME. The default password is FusionSphere123.

    Run the TMOUT=0 command to disable logout on timeout.

  4. Log in to FusionSphere, and choose Summary to view the management IP addresses of the control nodes.

  5. Copy the Storcli tool package to other nodes whose cache mode needs to be modified. (In the following command, XX.XX.XX.XX indicates the management IP address of the control node to be modified).

    scp storcli-007.0504.0000.0000-1.noarch.rpm fsp@XX.XX.XX.XX:/home/fsp/

  6. Log in to the control node as the fsp user, and run the su - root command to switch to the root user. The default password is Huawei@CLOUD8!.
  7. Go to the /home/fsp directory, and run the following command to install the Storcli tool:

    rpm -ivh storcli-007.0504.0000.0000-1.noarch.rpm

  8. Go to the /opt/MegaRAID/storcli directory, and check whether the cache mode of the RAID group is RWBD or RAWBD. If yes, go to the next step.

    ./storcli64 /c0/vall show

  9. Run the following command to change the cache mode of the RAID group to RWTD:

    ./storcli64 /c0/vall set wrcache=wt

  10. Run the following command to check whether the cache mode is RWTD:

    ./storcli64 /c0/vall show

  11. Go to the /home/fsp directory, and run the following commands to uninstall and delete the tool package:

    rpm -e storcli-007.0504.0000.0000-1.noarch

    rm storcli-007.0504.0000.0000-1.noarch.rpm
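To apply the same change to several control nodes, the per-node steps above can be scripted. The following is a hedged sketch, assuming the node management IP addresses are listed in NODES (a hypothetical placeholder) and password-based SSH as the fsp user; each scp/ssh call prompts for the corresponding password:

    # Batch version of steps 5 to 10; replace NODES with the real management IP addresses.
    NODES="XX.XX.XX.XX YY.YY.YY.YY"
    for ip in $NODES; do
        scp storcli-007.0504.0000.0000-1.noarch.rpm fsp@${ip}:/home/fsp/
        ssh -t fsp@${ip} "su - root -c ' \
            rpm -ivh /home/fsp/storcli-007.0504.0000.0000-1.noarch.rpm && \
            /opt/MegaRAID/storcli/storcli64 /c0/vall set wrcache=wt && \
            /opt/MegaRAID/storcli/storcli64 /c0/vall show'"
    done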

Scenario 2: A RAID group has been created on the server, but the OS is not installed.

  1. Log in to the Device Manager screen of the LSI SAS3508 RAID controller card. For details, see step 1 in "Rectification method".
  2. Choose Device Manager and press Enter.

  3. Choose Avago MegaRAID <SAS3508> Configuration Utility and press Enter.

  4. Choose Main Menu and press Enter.

  5. Choose Virtual Drive Management and press Enter.

  6. Choose the virtual disk to be operated and press Enter.

  7. Choose Advanced... and press Enter.

  8. Choose Default Write Cache Policy, and press Enter.
  9. Choose Write Through and press Enter.

  10. Choose Apply Changes and press Enter. "The operation has been performed successfully" is displayed.

  11. Choose OK and press Enter. The configuration is complete.
  12. Use the KVM to forcibly restart the server.

Scenario 3: No RAID group is created on the server.

  1. Access the main menu screen by referring to step 1 and step 2 in scenario 2. Choose Configuration Management and press Enter.

  2. Choose Create Virtual Drive and press Enter.

  3. Set Write Policy to Write Through.

  4. Choose Save Configuration and press Enter. The confirmation screen is displayed.
  5. Choose Confirm and press Enter.
  6. Choose Yes and press Enter. "The operation has been performed successfully" is displayed.
  7. Choose OK and press Enter. The configuration is complete.
  8. Use the KVM to forcibly restart the server.
Disk Grouping Failure on the CH121 V5
Problem Description
Table 5-52 Basic information

Source of the Problem: CH121 V5
Intended Product: CH121 V5+Avago SAS3408/CH121 V5+Avago SAS3508
Release Date: 2018-05-25
Keyword: Avago SAS3408, Avago SAS3508, CH121 V5, SATA controller, host ID, disk grouping failure

Symptom

During the startup, the UVP system allocates the host ID to the RAID controller card. The drivers of the RAID controller card and PCH SATA controller are loaded concurrently. At a low probability, the system may complete loading the driver of the PCH SATA controller first, and the host ID is allocated to the PCH SATA controller instead of the RAID controller card. As a result, disk grouping fails. The following figure shows the PCIe device ID of the PCH SATA controller.

The PCH SATA controller is not used in cloud core network solutions. Therefore, the PCH SATA controller can be disabled.
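To see which driver currently owns a given SCSI host ID, query sysfs from the OS. A minimal sketch (host numbers vary between startups, which is exactly the symptom described above):

    # "ahci" indicates the PCH SATA controller; "megaraid_sas" indicates the RAID controller card.
    for h in /sys/class/scsi_host/host*; do
        echo "$h: $(cat $h/proc_name)"
    done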

Key Process and Cause Analysis
  1. The PCH SATA controller is not used in cloud core network solutions. Therefore, the PCH SATA controller can be disabled during customization.
  2. For servers that have already been delivered, use uMate to disable the PCH SATA controller. The option for disabling the PCH SATA controller is not displayed in the BIOS. Therefore, the PCH SATA controller can be disabled only by using uMate.
  3. Choose BIOS Config on uMate.

  4. Enter the BMC management IP address, choose PCHConfiguration, and set SATA Controller and sSATA Controller to Disabled.

    If the product name is incorrect, the device may fail to be discovered.

  5. Restart the OS of the compute node to apply the BIOS configuration.
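After the restart, you can verify from the OS that the controllers are gone. A minimal check, assuming lspci is available on the compute node:

    # No output means the SATA and sSATA controllers were disabled successfully.
    lspci | grep -i sata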
Conclusion and Solution

Conclusion:

  1. In the E9000 CH121 V5+Avago SAS3508 environment, the system scans the PCI devices during startup. The PCH SATA controller and RAID controller card are scanned concurrently, and the SCSI host ID is allocated to the first discovered device. Therefore, the SCSI host ID may change after the system is restarted.
  2. The PCH SATA controller is not used on the V5 server. Disable the PCH SATA controller by using uMate.

Solution:

  1. Use uMate to disable the PCH SATA controller for products that are delivered.
  2. Disable the PCH SATA controller during production.
Experience

None

Note

None

Problems of HBAs, FC/FCoE Switch Modules and iSCSI Switch Modules

Compute Node UEFI Startup Failure (E9000 V5+HBA)
Problem Description
Table 5-53 Basic information

Source of the Problem: E9000
Intended Product: E9000 V5
Release Date: 2018-05-06
Keyword: HBA, UEFI

Symptom

When an HBA is used, the CH121 V5 stays at the following screen during the UEFI startup.

Key Process and Cause Analysis

After an MZ912 NIC is installed on the CH121 V5, "CR has Bad Signature" is displayed during the POST stage when the BIOS is loading the NIC. The problem persists after the compute node is restarted multiple times.

Conclusion and Solution

Conclusion:

This problem occurs at a high probability, and is caused by a bug in the HBA firmware.

Solution:

Upgrade the HBA firmware to resolve the problem. The upgrade method is as follows:

Download the firmware upgrade tool on the following website: http://support.huawei.com/enterprise/en/software/22969026-SW2000016088

Download the firmware upgrade guide on the following website: http://support.huawei.com/enterprise/en/doc/EDOC1000168684?idPath=7919749%7C9856522%7C9856629%7C21015513

For details about how to upgrade the HBA firmware, see chapter 3 "Installing and Using FwUpgrade" in the firmware upgrade guide.

Temporary solution:

Power off and then power on the compute node. Retry if the problem persists.
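If no operator is on site, the power cycle can also be issued remotely over IPMI. A hedged sketch, assuming ipmitool is installed on a maintenance host and IPMI over LAN is enabled on the iBMC (the IP address and password are placeholders):

    # Remotely power the compute node off and on again through the iBMC.
    ipmitool -I lanplus -H <iBMC_IP> -U Administrator -P <password> chassis power cycle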

Experience

None

Note

None

Imbalanced Traffic Loads of LAG Member Ports in a Stacking Network
Problem Description
Table 5-54 Basic information

Source of the Problem: E9000
Intended Product: E9000
Release Date: 2018-03-15
Keyword: Link aggregation (LAG)

Symptom

In a stacking network, when the links to the uplink switches are configured as a link aggregation group, the traffic loads of the member network ports are imbalanced. The following figure shows the traffic loads of the network ports.

The external switches are stacked. The 2X and 3X switch modules are stacked. An Eth-Trunk link is configured between the external switches and switch modules.

When the traffic load imbalance occurs, all data is sent by the 2X switch module. Because the 3X switch module shares none of the traffic, the workload of SWITCH1 increases, and so does the transmission delay. When the system needs to forward a large amount of data, this problem may degrade service performance.

Key Process and Cause Analysis
  1. The local preference policy may cause this problem.

    This policy allows the data to be sent preferentially by the device that receives the data. If the local device (device that receives the data) is faulty, the data is sent by other member switch modules.

    The local preference policy reduces the bandwidth usage of the internal communication ports, but increases the bandwidth usage of the local outbound ports and the peer switch. Run the following commands to check whether the local preference policy is enabled for the ports of the Eth-Trunk link:

    [~HUAWEI-Eth-Trunk10]dis this

    #

    interface Eth-Trunk10

    #

    return

    [~HUAWEI-Eth-Trunk10]

    The local preference policy is enabled by default. The command output shows that the local preference policy is enabled. To disable the local preference policy, run the following commands:

    [~HUAWEI-Eth-Trunk10]local-preference disable

    [*HUAWEI-Eth-Trunk10]commit

    [~HUAWEI-Eth-Trunk10]display this

    #

    interface Eth-Trunk10

    local-preference disable

    #

    return

    [~HUAWEI-Eth-Trunk10]

  2. The manual load balancing policy may cause this problem.

    The manual load balancing policy allows multiple member ports to be manually added into the link aggregation group. All ports are in forwarding state and share the traffic load. The CX910 series switch modules support the following load balancing modes: source MAC address, destination MAC address, source XOR destination MAC address, source IP address, destination IP address, and source XOR destination IP address.

    If all incoming traffic is of the same type (for example, traffic that varies only in one MAC address field), the hash function may direct all traffic to one device, making the traffic load imbalanced. Run the following commands to check whether the manual load balancing policy is enabled (a sketch for changing the hash mode follows this list).

    [~HUAWEI]interface eth-trunk 10

    [~HUAWEI-Eth-Trunk10]dis this

    #

    interface Eth-Trunk10

    load-balance dst-ip

    local-preference disable

    #

    return

    [~HUAWEI-Eth-Trunk10]

  3. The LACP policy may cause this problem.

    The static LACP mode uses the LACP protocol to negotiate parameters and determine the active and inactive ports. The static LACP mode is also called the M:N mode. This mode enables traffic load sharing and link redundancy. In a link aggregation group, M active links are responsible for forwarding data and performing load balancing, and the other N inactive links are backup and do not forward data.

    If all active ports are in the 2X switch module and all inactive ports are in the 3X switch module, the uneven distribution of the active and inactive ports causes traffic imbalance. Run the following commands to check the distribution of the active and inactive ports:

    [~SwitchA] display eth-trunk 1

    Eth-Trunk1's state information is:

    Local:

    LAG ID: 1 Working Mode: Static

    Preempt Delay: Disabled Hash Arithmetic: profile default

    System Priority: 100 System ID: 0025-9e95-7c31

    Least Active-linknumber: 1 Max Active-linknumber: 2

    Operating Status: up Number Of Up Ports In Trunk: 2

    Timeout Period: Slow

    --------------------------------------------------------------------------------

    ActorPortName Status PortType PortPri PortNo PortKey PortState Weight

    10GE1/17/1 Selected 10GE 100 1 20289 10111100 1

    10GE1/17/2 Selected 10GE 100 2 20289 10111100 1

    10GE1/17/3 Unselect 10GE 32768 3 20289 10100000 1

    Partner:

    --------------------------------------------------------------------------------

    ActorPortName SysPri SystemID PortPri PortNo PortKey PortState

    10GE1/17/1 32768 0025-9e95-7c11 32768 4 20289 10111100

    10GE1/17/2 32768 0025-9e95-7c11 32768 5 20289 10111100

    10GE1/17/3 32768 0025-9e95-7c11 32768 6 20289 10100000

    Unselect indicates that the port is inactive. If the port distribution is imbalanced, you need to re-distribute the ports so that the active interfaces are evenly distributed to the two switch modules.
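For the manual load balancing policy described in item 2, changing the hash mode so that it matches the fields that actually vary across flows usually restores the balance. A hedged sketch on the switch module, using the source XOR destination IP address mode (choose whichever of the modes listed above fits your traffic):

    [~HUAWEI]interface eth-trunk 10
    [~HUAWEI-Eth-Trunk10]load-balance src-dst-ip
    [*HUAWEI-Eth-Trunk10]commit
    [~HUAWEI-Eth-Trunk10]display this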

Experience

None

Note

None

Common Problems of the Management Software

Failed to Boot from PXE on the 2288H V5
Problem Description
Table 5-55 Basic information

Source of the Problem: CH121 V5
Intended Product: V5 servers
Release Date: 2018-02-21
Keyword: V5, PXE, stateless computing

Symptom

Stateless computing is used on E9000 CH121 V5 compute nodes to modify server BIOS configurations in batches. After the modification, the OS fails to be booted from PXE, and an error is reported during booting.

Key Process and Cause Analysis

Analyze the feedback information. The procedure is as follows:

  1. PXE boot failure analysis:

    The error code PXE-E18 indicates that the response from the PXE server timed out and a media failure is detected.

  2. The following measures are taken:
    1. The BIOS settings may have been changed after stateless computing is used. However, before the iBMC logs are obtained, the support engineers cannot determine which settings are changed.
    2. The customer is advised to change the boot mode to legacy on the iBMC WebUI and boot the server from PXE. After the boot mode is changed, the OS can be booted from PXE. The customer requires a method to change the boot mode in batches, and is advised to use stateless computing.

  3. Log analysis:

    Records in operate_log show that multiple BIOS settings are changed. However, the current BIOS settings are not displayed in the logs.

    Check the currentvalue.json file in the appdump\bios folder.

    The boot mode is legacy, and the boot sequence is the hard disk drive, DVD-ROM drive, PXE, and others.

    The error messages and boot sequence show that the first boot device is not PXE, and the long booting time causes the upper-layer PXE server to time out. As a result, an error occurs during the server startup. After PXE is set to the first boot device, the problem is resolved.
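To set PXE as the first boot device without the WebUI, the standard IPMI boot override can be used in batches. A hedged sketch, assuming ipmitool and IPMI over LAN enabled on the iBMC (the IP address and password are placeholders; without options=persistent the override applies to the next boot only):

    # One-time PXE boot on the next restart:
    ipmitool -I lanplus -H <iBMC_IP> -U Administrator -P <password> chassis bootdev pxe
    # Make PXE the persistent first boot device:
    ipmitool -I lanplus -H <iBMC_IP> -U Administrator -P <password> chassis bootdev pxe options=persistent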

Conclusion and Solution

Conclusion:

During PXE booting, the first boot device is the hard disk or DVD-ROM drive. As a result, the communication with the PXE server times out and an error occurs.

Experience

During PXE booting, set PXE as the first boot device.

Note

Introduction to E9000 Stateless Computing

The Tecal E9000 chassis houses management modules (MMs), compute nodes, and switch modules. The MMs monitor and manage the chassis and components, and provide the web user interface (WebUI), command line interface (CLI), and Simple Network Management Protocol (SNMP) for management.

The E9000 single-chassis stateless computing feature simplifies configuration and maintenance of the compute nodes through the MM910.

Configuring the E9000 Single-Chassis Stateless Computing Feature

Stateless computing extracts compute node hardware configurations to form a configuration policy file (profile) so that hardware configurations are separated from hardware. The profile enables offline configuration, migration, remote batch deployment, and hardware data import and export.

The physical configuration for a compute node varies with the deployed services and network access mode. For example, if VM services are deployed, you need to set the network interface card (NIC) to SR-IOV mode to improve virtualization computing performance. If highly reliable database services are deployed, you need to configure the memory reliability parameters.

Typically, the configuration items for a compute node include the configurations of network, storage, computing, and management.

Network configuration refers to the configuration for network access parameters, including MAC address, virtual NICs, VLAN and quality of service (QoS) for virtual NICs, and remote pre-boot execution environment (PXE) startup.

Storage configuration refers to the configuration for storage network access and local storage parameters. The storage network access parameters include FC or fiber channel over Ethernet (FCoE) WWN and SAN Boot. The local storage parameter refers to the local RAID.

Computing configuration refers to the configuration for computing attribute parameters, including OS startup mode and sequence, memory reliability, availability, serviceability (RAS) configuration, energy conservation, virtualization, and universally unique identifier (UUID).

Management configuration refers to the configuration for compute node management attribute parameters, including intelligent platform management interface (IPMI) behaviors and system serial port.

The MAC address and WWN of a device are globally unique. Device replacement may cause changes to the network, storage, or software configurations. Therefore, resource pools are created for managing these parameters to ensure configuration inheritance. For example, a MAC address resource pool is built, from which the MAC addresses of all compute nodes are obtained. If a server is replaced due to failures, the MAC address is retrieved and automatically used on the new node without changing the network configuration, which simplifies maintenance and reduces workload.

E9000 Single-Chassis Stateless Computing Feature Parameters

BIOS Configuration

Boot

Quick Boot: Specifies whether to enable the quick boot mode. In this mode, the system skips some detection steps to shorten the boot time. Options: Disabled (0), Enabled (1). Default: Enabled.

Quiet Boot: Specifies whether to enable the quiet boot mode. In this mode, the POST text output is hidden during startup. Options: Disabled (0), Enabled (1). Default: Disabled.

PXE Only: The OS boots only from PXE. Options: Disable (0), Enable (1). Default: Disable.

Boot Sequence: Boot order. Default: -.

Advance Processor

Power Policy Select: Specifies the system energy efficiency policy. Options: Efficient (0), which saves system power; Performance (1), which ensures system performance; Custom (2), which strikes a balance between power saving and system performance. Default: Custom.

Turbo Mode: Specifies the CPU acceleration mode, which enables the CPU to run at a frequency higher than the nominal frequency. Options: Enable (1), Disable (0). Default: Enabled.

Intel HT Technology: Specifies whether to enable Intel Hyper-Threading (HT) Technology, which enhances CPU performance by increasing the number of CPU core threads. Options: Enable (1), Disable (0). Default: Enabled.

EIST Support: Specifies whether to enable Enhanced Intel SpeedStep Technology (EIST). When the CPU usage is low, EIST dynamically reduces the CPU operating frequency to minimize system power consumption and heat. When the CPU usage is high, EIST immediately restores the CPU operating frequency to its original value. Options: Enable (1), Disable (0). Default: Enabled.

Power Saving: Specifies whether to enable the CPU P-state adjustment function, which reduces power consumption by changing the CPU P-states. Options: Enable (1), Disable (0). Default: Disabled.

P State Domain: Sets the P-state domain. Options: Per Logical (0), which sets P-states per logical CPU; Per Package (1), which sets P-states per CPU package. Default: Per Logical.

C-States: Specifies whether to enable the CPU C-state function. C-state is a deep power-down technology. C3, C6, and C7 indicate energy-saving effect and CPU recovery time in ascending order. Sub-options: OS ACPI Cx, where the OS refers to the specified advanced configuration and power interface (ACPI) Cx state and instructs the CPU to enter that C-state (ACPI-C2 (0), ACPI-C3 (1)); Enhanced C-State, which enables the P-states to change with the C-states; Enable C3, which stops all internal CPU clocks, including the bus interface (BI) and APIC; Enable C6, which reduces the processor voltage to 0; Enable C7, which retains only the last thread and flushes the remaining last level cache (LLC). Default: Enabled.

Enhanced C-State: See the preceding description. Options: Enable (1), Disable (0). Default: Enabled.

Enable C3: See the preceding description. Options: Enable (1), Disable (0). Default: Disable.

Enable C6: See the preceding description. Options: Enable, Disable. Default: Disable.

Enable C7: See the preceding description. Options: Enable, Disable. Default: Disable.

Memory

Memory RAS (Mirror, Lockstep, Sparing): Memory RAS features. Options: Independent, where the memory channels are independent of each other; Mirror, where memory mirroring provides half of the total memory capacity; Lockstep, which improves memory reliability but affects memory performance; Rank Sparing, where the available capacity equals the total memory capacity of one channel minus the capacity of one rank. Default: Independent.

NUMA: Specifies whether to enable the non-uniform memory access (NUMA) technology to improve memory access performance for different CPUs. Options: Disable, Enable. Default: Enable.

Virtual

SR-IOV: Specifies whether to enable the Single Root I/O Virtualization (SR-IOV) technology, which virtualizes one physical PCIe device into multiple logical, independent PCIe devices. Options: Disabled, Enabled. Default: Enabled.

VT-D: Specifies whether to enable the virtualization technology (VT) for directed I/O. Options: Disabled, Enabled. Default: Enabled.

Interrupt Remap: Specifies whether to enable the interrupt remap function. Options: Disable, Enable. Default: Disable.

Coherency Support: Specifies whether to enable the coherency support function. Options: Disable, Enable. Default: Disable.

ATS Support: Specifies whether to enable the ATS mechanism. The ATS mechanism is provided by the PCIe bus and implemented by the PCIe device. When a PCIe device sends transaction layer packets (TLPs) in address route mode, the address is converted into a host physical address (HPA), relieving the VT-d workload. In addition, ATS prevents mutual impact between devices in different domains. Options: Disable, Enable. Default: Enable.

Pass Through DMA Support: Specifies whether to enable the pass-through DMA function. Options: Disable, Enable. Default: Enable.

System

Resume Ac On Power Loss: Specifies the power state after the AC power supply is lost and then recovered. Options: Power Off (0), where the server remains powered off after the AC power supply is recovered; Last State (1), where the server restores the previous power state; Power On (2), where the server starts after the AC power supply is recovered. Default: Power Off.

BMC WDT Support For POST: Specifies whether to enable the watchdog during the power-on self-test (POST) process. Options: Disabled (0), Enabled (1). After the watchdog is enabled, the following parameters are displayed: BMC WDT Time Out For POST, which specifies the watchdog timeout period in the POST phase, and BMC WDT Action For POST, which specifies the action taken by the watchdog if a timeout error occurs in the POST phase. Default: Disabled.

BMC WDT Action For POST: Specifies the action taken by the watchdog if a timeout error occurs during the POST process. Options: No Action (0), where the watchdog takes no action; Hard Reset (1), where the system is reset forcibly; Power Down (2), where the system is powered off; Power Cycle (3), where the system is powered off and restarts. Default: Hard Reset.

BMC WDT Time Out For POST: Specifies the watchdog timeout period during the POST process, in minutes. Value range: 4 to 8. Default: 5.

BMC WDT Support For OS: Specifies whether to enable the watchdog timer for the OS startup. Options: Disabled (0), Enabled (1). After the watchdog is enabled, the following parameters are displayed: BMC WDT Time Out For OS, which specifies the watchdog timeout period during OS startup, and BMC WDT Action For OS, which specifies the action taken by the watchdog if a timeout error occurs during OS startup. Default: Disabled.

BMC WDT Action For OS: Specifies the action taken by the watchdog if a timeout error occurs during OS startup. Options: No Action (0), where the watchdog takes no action; Hard Reset (1), where the system is reset forcibly; Power Down (2), where the system is powered off; Power Cycle (3), where the system is powered off and restarts. Default: Hard Reset.

BMC WDT Time Out For OS: Specifies the watchdog timeout period during OS startup, in minutes. Value range: 2 to 8. Default: 5.

Console Serial Redirect: Specifies whether to enable the serial port redirection function. If it is enabled, data over a physical or virtual serial port can be redirected to the specified system serial port. Options: Enable (1), Disable (0). Default: Enable.

Baud Rate: Specifies the baud rate of the serial port, that is, the number of bits transmitted per second. Options: 115200 (7), 57600 (6), 38400 (5), 19200 (4), 9600 (3), 4800 (2), 2400 (1), 1200 (0). Default: 115200.

NIC configuration:

  1. SRIOVState (Enabled/Disabled): This parameter and the multi-channel function are mutually exclusive. It is available only when the multi-channel function is disabled.
  2. UMCState (Enabled/Disabled): Enables or disables the multi-channel function.
  3. BootEnable (Enabled/Disabled): Enables or disables network booting.
  4. PFType (NIC/FCoE/iSCSI): Used only for setting PF2/PF3.
  5. MinBandwidth (1-100): Available only when the multi-channel function is enabled.
  6. MaxBandwidth (0-100): Available only when the multi-channel function is enabled.
  7. PFVlanID (0002-4094): Available only when the multi-channel function is enabled.
  8. PXEVlan (Enabled/Disabled): Available only when PFType is set to NIC.
  9. PXEVlanID (0002-4094): Available only when PFType is set to NIC.
  10. PXEVlanPriority (0-7): Available only when PFType is set to NIC.
  11. WWNN (1000xxxxxxxxxxxx): Available only when PFType is set to FCoE. The value is lost upon abnormal power-off.
  12. WWPN (2000xxxxxxxxxxxx): Available only when PFType is set to FCoE. The value is lost upon abnormal power-off.
  13. MAC (000000000000-FFFFFFFFFFFF): The 48-bit MAC address. The value is lost upon abnormal power-off.

NOTE:
  • The parameters vary according to NIC types.
  • After multi-channel is enabled, multiple virtual NICs (physical functions) share one physical NIC channel and one physical link. This feature requires a NIC that supports multi-channel.

Server Uniqueness Configuration

The servers are identified by using the universally unique identifier (UUID).

Implementation Principle

On an E9000 server, stateless computing is controlled by the MM910 management module. The MM910 works with the compute node BMC, BIOS, NIC, and other components to implement stateless computing. The compute node configurations (consisting of setting items) are saved in profiles on the management module. When a compute node is inserted into the chassis, the management module delivers the profile to the compute node based on the mapping between profiles and compute node slots. After the profile is delivered, the compute node configuration is synchronized with the management module.

The compute node profiles are saved in the management module and can be imported and exported. The active and standby management modules automatically synchronize with each other. When the active and standby management modules are switched over, the profiles are not affected.

E9000 Single-Chassis Stateless Computing Hardware Requirements

All E9000 compute nodes support stateless computing. The NIC models that support stateless computing are MZ510, MZ512, and MZ910. The MZ510 and MZ512 support multi-channel. The MZ910 does not support multi-channel.

Features of E9000 Stateless Computing

Offline profile configuring:

When the compute node is not in position, you can import or generate a profile. After the compute node is inserted into the chassis, the profile takes effect immediately.

The profiles are separated into common and customized profiles. Common profiles can be used by multiple compute nodes. Customized profiles include private policies such as MAC addresses. A customized profile is associated with a slot and is exclusively used by the compute node in the slot.

Unique resources such as MAC addresses and WWN numbers are stored in a resource pool. The profile must obtain such resources from the resource pool. WWN numbers and MAC addresses share one resource pool. A WWN number is generated by adding a 1000 or 2000 prefix to the MAC address.
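The WWN derivation rule is simple enough to reproduce. A minimal illustration in shell (the MAC address value is a hypothetical example):

    # Hypothetical MAC address drawn from the resource pool:
    MAC=0025909A0001
    WWNN=1000${MAC}   # node WWN: 1000 prefix + MAC address
    WWPN=2000${MAC}   # port WWN: 2000 prefix + MAC address
    echo "WWNN=${WWNN} WWPN=${WWPN}"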

Profile import:

You can import a profile from an in-position compute node or an external source.

Import a profile from an in-position compute node.

Import a profile from an external PC.

Profile operations:

You can manually create a profile.

The created profile can be edited, deleted, copied, exported, or delivered.

Edit: Modify the existing configuration items.

Delete: Delete the profile.

Copy: Copy the profile. You can edit the copied profile and generate a new profile.

Export: Export the profile to a PC.

Deliver: Deliver the profile to the compute node in the associated slot. This operation takes effect immediately and resets the compute node.


Profile associating:

After a profile is imported or manually created, you need to configure the association between the profile and the slots. When a compute node is installed, the profile is automatically delivered. If the profile does not match the compute node hardware, the system generates an alarm.

A common profile can be associated with multiple slots. A customized profile can be associated with only one slot.

Profile migration:

A profile can be migrated from one slot to another. If a virtual link is configured, the corresponding ports of the switch modules are also migrated along with the profile. After the migration, the configurations are deleted from the source slot and take effect on the destination slot.
