FusionStorage 8.0.0 Block Storage Parts Replacement 04

Online Replacing an SSD Used as the Cache

NOTE:

This chapter applies to TaiShan 2280 V2 12-slot nodes, TaiShan 2280 V2 25-slot nodes, TaiShan 5280 V2 36-slot nodes, 2288H V5 12-slot nodes, and 5288 V5 36-slot nodes.

This chapter describes how to replace SSDs used as the cache when the SSDs are not faulty but are nearing the end of their service life or have defects. You do not need to stop system services during the replacement.

Impact on the System

If the system has heavy service loads, the replacement will increase the service response time. Perform the replacement during off-peak hours.

Prerequisites

  • The spare SSD is ready.
  • The SSD to be replaced has been located.
  • The storage pool to which the SSD to be replaced belongs is normal, and no data reconstruction task is running.
NOTE:

For details about the positions of SSDs, see Slot Numbers.

Precautions

  • Wait until the removal of an SSD is complete before you remove another one.
  • Wait until the insertion of an SSD is complete before you install another one.
  • NVMe SSDs support only orderly hot swap. Before removing an NVMe SSD, stop all services accessing the NVMe SSD.
  • When replacing an SSD, wait for 30 seconds after it is removed before you install a new one.

Tools and Materials

  • ESD gloves
  • ESD wrist straps
  • ESD bags
  • Labels

Procedure

  1. Obtain the ESN of the SSD to be replaced.

    1. Log in to DeviceManager, and choose Services > Block Service > Storage Pool.
    2. Click in the row where the desired storage pool resides, and select Disk Topology from the shortcut menu.
    3. Move the cursor to the SSD to be replaced. The SN that is displayed is the ESN of the SSD to be replaced.

  2. Log in to the primary management node as user dsware, and run the following command to switch the node where the SSD is to be replaced to maintenance mode. To run this command, enter the name and password of CLI super administrator account admin as prompted:

    sh /opt/dsware/client/bin/dswareTool.sh --op setServerStorageMode -ip Management IP address of the node -mode 1
  3. Start the storage pool data pre-flush.

    1. Log in to the primary management node as user dsware.
    2. Run the following command to start the data pre-flush. To run this command, enter the name and password of CLI super administrator account admin as prompted:

      sh /opt/dsware/client/bin/dswareTool.sh --op startPreflush -id Storage pool ID

    3. Run the following command to query the data pre-flush progress of the storage pool. The data pre-flush is complete when the progress is 100%. To run this command, enter the name and password of CLI super administrator account admin as prompted:

      sh /opt/dsware/tools/ops_tool/replace_ssd/query_preflush_process.sh
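
The wait for 100% progress can be scripted rather than checked by hand. The following is a minimal sketch, not a FusionStorage tool: the helper names extract_progress and wait_for_preflush are hypothetical, and the parsing assumes that the query output contains the progress as digits followed by a percent sign, which this document does not confirm. Because the query script prompts for the admin credentials, the sketch takes the query command as arguments so that you can pass whatever invocation works in your environment.

```shell
# Hypothetical helpers; not part of FusionStorage.

# Print the last integer percentage (for example "87" from "87%") read on stdin.
# ASSUMPTION: progress appears as "<digits>%". The real output format of
# query_preflush_process.sh is not shown in this document.
extract_progress() {
    grep -oE '[0-9]+%' | tail -n 1 | tr -d '%'
}

# Run the given query command repeatedly until it reports 100%.
# Usage: wait_for_preflush <interval_seconds> <max_attempts> <command> [args...]
wait_for_preflush() {
    interval=$1
    max_attempts=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        progress=$("$@" | extract_progress)
        if [ "$progress" = "100" ]; then
            echo "pre-flush complete"
            return 0
        fi
        echo "pre-flush progress: ${progress:-unknown}%"
        sleep "$interval"
        attempt=$((attempt + 1))
    done
    echo "pre-flush did not reach 100% after ${max_attempts} attempts" >&2
    return 1
}
```

For example, wait_for_preflush 30 120 sh /opt/dsware/tools/ops_tool/replace_ssd/query_preflush_process.sh would poll every 30 seconds for up to an hour, provided the output matches the assumed format.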

  4. Perform pre-processing on the node where the SSD is to be replaced.

    1. Log in to the primary management node as user dsware.
    2. Run the following command to perform pre-processing on the node where the SSD is to be replaced. To run this command, enter the name and password of CLI super administrator account admin as prompted:

      sh /opt/dsware/tools/ops_tool/replace_ssd/preprocessing_replace_ssd_cache.sh -p Storage pool ID -a Management IP address of the node

  5. Optional: If the SSD to be replaced is an NVMe SSD, log in to the primary management node as user dsware and run the following command to logically power off the NVMe SSD to be replaced. To run this command, enter the name and password of CLI super administrator account admin as prompted:

    sh /opt/dsware/client/bin/dswareTool.sh --op poweroffNvmeDisk -ip Management IP address of the node -slotNo Slot number of the SSD to be replaced -esn ESN of the SSD to be replaced

  6. Remove the disk module.

    Record the slot where the disk module resides, and install the spare disk module into the same slot after the replacement. Otherwise, services may be affected.

    1. Press the button that secures the disk module ejector lever, as shown in step 1 in Figure 11-1.

      The ejector lever automatically ejects.

      Figure 11-1 Removing a disk module
    2. Hold the ejector lever, and pull out the disk module for approximately 3 cm, as shown in step 2 in Figure 11-1.
    3. Wait at least 30 seconds, and then slowly pull out the disk module, as shown in step 3 in Figure 11-1.

  7. Place the removed SSD in an ESD bag.
  8. Take the spare SSD out of its ESD bag.
  9. Install the disk module.

    Install the spare disk module into the same slot from which the original disk module was removed. Otherwise, services may be affected.

    1. Raise the ejector lever and push the disk module in along the guide rails until it does not move, as shown in step 1 in Figure 11-2.
      Figure 11-2 Installing a disk module
    2. Ensure that the ejector lever is fastened to the beam, and lower the ejector lever to completely insert the disk module into the slot, as shown in step 2 in Figure 11-2.

  10. Optional: Add the spare SSD to the storage pool.

    1. Wait for 5 minutes after the replacement, and then obtain the ESN of the spare SSD.
      1. Log in to DeviceManager, and choose Cluster > Hardware.
      2. Locate the node where the SSD was replaced based on the node IP address, and click to display the node details.
      3. Move the cursor to the corresponding slot. The SN that is displayed is the ESN of the spare SSD.
    2. Run the following command to add the spare SSD to the storage pool. To run this command, enter the name and password of CLI super administrator account admin as prompted:

      sh /opt/dsware/client/bin/dswareTool.sh --op replaceSSDCache -id Storage pool ID -oldEsn ESN of the SSD to be replaced -newEsn ESN of the spare SSD -nodeMgrIp Management IP address of the node -type online
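
Because this command takes four site-specific values, it is easy to swap the old and new ESNs or leave a placeholder unfilled. The following is a small sketch of a helper that fills in the placeholders and prints the resulting command line for review before you run it. The function name build_replace_cmd is hypothetical, the checks are illustrative only, and the flags are exactly those shown in the command above.

```shell
# Hypothetical helper; not part of FusionStorage.
# Prints the replaceSSDCache command line with the placeholders filled in.
# Usage: build_replace_cmd <pool_id> <old_esn> <new_esn> <node_mgr_ip>
build_replace_cmd() {
    if [ "$#" -ne 4 ]; then
        echo "usage: build_replace_cmd <pool_id> <old_esn> <new_esn> <node_mgr_ip>" >&2
        return 2
    fi
    # Reject empty values so that no placeholder is silently left blank.
    for arg in "$@"; do
        if [ -z "$arg" ]; then
            echo "error: empty argument" >&2
            return 2
        fi
    done
    # The spare SSD must differ from the SSD that was removed.
    if [ "$2" = "$3" ]; then
        echo "error: old and new ESN are identical" >&2
        return 2
    fi
    printf 'sh /opt/dsware/client/bin/dswareTool.sh --op replaceSSDCache -id %s -oldEsn %s -newEsn %s -nodeMgrIp %s -type online\n' \
        "$1" "$2" "$3" "$4"
}
```

For example, build_replace_cmd 0 OLDESN123 NEWESN456 192.0.2.10 prints the complete command for storage pool 0; the pool ID, ESNs, and IP address here are made-up values.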

  11. Perform post-processing on the node where the SSD is replaced.

    1. Log in to the primary management node as user dsware.
    2. Run the following command to perform post-processing on the node where the SSD is replaced. To run this command, enter the name and password of CLI super administrator account admin as prompted:

      sh /opt/dsware/tools/ops_tool/replace_ssd/postprocessing_replace_ssd_cache.sh -p Storage pool ID -a Management IP address of the node

  12. Cancel the data pre-flush of the storage pool.

    1. Log in to the primary management node as user dsware.
    2. Run the following command to cancel the data pre-flush of the storage pool. To run this command, enter the name and password of CLI super administrator account admin as prompted:

      sh /opt/dsware/client/bin/dswareTool.sh --op stopPreflush -id Storage pool ID

  13. Check the firmware version.

    1. Use the KVM to log in to the storage node as user root.
    2. Run the following command to check the firmware version (nvme0 is used as an example):

      hioadm updatefw -d nvme0

      [root@node0101 ~]# hioadm updatefw -d nvme0
      slot  version   activation
      1     3.10       
      2     3.10      current
      3     3.10 

      The version in the row marked current is the active firmware version. If this version is earlier than 3.10, contact Huawei technical support.
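
The firmware check can be automated by parsing the table that the command prints. The following is a minimal sketch with hypothetical helper names (current_fw_version and version_at_least). It assumes that the output has the three-column layout shown in the example above, and that versions are dot-separated numbers compared field by field, so that 3.9 is earlier than 3.10. Neither assumption is confirmed by this document.

```shell
# Hypothetical helpers; not part of hioadm or FusionStorage.

# Read "hioadm updatefw -d <device>" output on stdin and print the version
# from the row whose activation column is "current".
# ASSUMPTION: columns are slot, version, activation, as in the example output.
current_fw_version() {
    awk '$3 == "current" { print $2 }'
}

# Succeed if version $1 is equal to or later than version $2.
# ASSUMPTION: versions are dot-separated integers compared field by field.
version_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        n = split(have, h, ".")
        m = split(want, w, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            a = (i <= n) ? h[i] + 0 : 0
            b = (i <= m) ? w[i] + 0 : 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0
    }'
}
```

On the storage node, this could be used as follows: fw=$(hioadm updatefw -d nvme0 | current_fw_version), and then version_at_least "$fw" 3.10 || echo "contact Huawei technical support".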

  14. Check the system status.

    On SmartKit, choose Home > Storage > Routine Maintenance > More > Inspection and check the system status.
    • If all inspection items pass the inspection, the inspection is successful.
    • If some inspection items fail, the inspection fails. Rectify the faults by taking recommended actions in the inspection reports. Perform inspection again after fault rectification. If the inspection still fails, contact Huawei technical support.

    For details, see the FusionStorage Block Storage Administrator Guide.

Follow-up Procedure

Label the replaced SSD to facilitate subsequent operations.

Updated: 2019-09-19

Document ID: EDOC1100081420
