FusionStorage V100R006C10 Block Storage Service Parts Replacement 06

Replacing the SSD Device Used As Caches

Scenarios

Replace a faulty solid-state drive (SSD) device that is used as a cache on the server.

Impact on the System

This operation has no adverse impacts on the system.

Prerequisites

Conditions

  • Spare parts of the same model and specifications as the faulty component are available.
  • The server that houses the faulty device has been identified with a label on its front panel.
  • The FusionStorage storage pool is in the normal state and no reconstruction task is running before disk replacement.

    You can check the pool state and reconstruction tasks as follows: On the FusionStorage Block self-service maintenance platform, choose Resource Pool > Storage Pools and view the status of desired storage pools and reconstruction tasks. For example, Figure 2-7 shows that the storage pool is in the normal state and one reconstruction task is running with a progress of 57%.

    Figure 2-7  Checking the pool state and reconstruction tasks
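
The prerequisite above can be written as a small check. This is only a sketch: the function name and the status string "normal" are illustrative assumptions, not values returned by any FusionStorage interface. The inputs would be read manually from the Storage Pools page.

```python
def ready_for_disk_replacement(pool_status: str, running_reconstruction_tasks: int) -> bool:
    """Return True only if the pool is normal AND no reconstruction task is running.

    Hypothetical helper: both arguments are values an operator reads from the
    Storage Pools page; nothing here queries FusionStorage itself.
    """
    return pool_status == "normal" and running_reconstruction_tasks == 0


# The state shown in Figure 2-7 (normal pool, one task at 57%) does not yet
# satisfy the prerequisite, because a reconstruction task is still running.
figure_2_7_ready = ready_for_disk_replacement("normal", 1)
```

Under this reading, the replacement should wait until the reconstruction task shown in the figure has finished.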

Data

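Category | Name | Description | Example
FusionStorage system information | Login address | Used for querying the disk topology of the faulty server, placing the server into maintenance mode, or removing the server from maintenance mode. | http://192.168.40.10:28443/fsportal
FusionStorage system information | Administrator username and password | Used for querying the disk topology of the faulty server, placing the server into maintenance mode, or removing the server from maintenance mode. | admin/IaaS@PORTAL-CLOUD8!
Baseboard management controller (BMC) information of the faulty server | BMC IP address | Used for logging in to the server BMC system to power off the server. | 192.168.30.7
Baseboard management controller (BMC) information of the faulty server | Password for user root | Used for logging in to the server BMC system to power off the server. | Huawei12#$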

Procedure

    Determine the status of the faulty device.

    1. On the disk topology page of the storage pool, perform the required operation based on the status of the faulty device (SSD device used as caches).

      To query the troubleshooting method, click the faulty device and then click Query Troubleshooting Method. In the displayed dialog box, click OK. Click the faulty device again to display the recommended troubleshooting method. Figure 2-8 shows the key operation steps.
      Figure 2-8  Querying the troubleshooting method recommended by the system
      • If the device is not removed from the storage pool and Repair is displayed on the page, restore the device by performing operations provided in the FusionStorage Block Storage Service Emergency Handling Guide. If the device is removed from the storage pool after the restoration, select the device and click Add to Storage Pool to add it to the storage pool again.
      • If the device is not removed from the storage pool, and the ALM-51003 Faulty Storage Pool alarm is generated, restore the device by performing operations provided in the FusionStorage Block Storage Service Emergency Handling Guide. If the device is removed from the storage pool after the restoration, select the device and click Add to Storage Pool to add it to the storage pool again.
      • If the device is not removed from the storage pool and "You need to forcibly replace the medium to rectify the fault" is displayed on the page, go to 2 (Replace the faulty device). In this case, set ignoreMediaFault to false when you later run the forceReplaceSSD command.
      • If the device is not removed from the storage pool, and the ALM-51003 Faulty Storage Pool alarm is not generated, go to 2 (Replace the faulty device). In this case, set ignoreMediaFault to true when you later run the forceReplaceSSD command.
      • If the device has been removed from the storage pool and "Medium status error. Change the medium." is displayed on the page, go to 2 (Replace the faulty device).
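
The five cases above can be sketched as a decision function. This is an illustration only: the function, the Decision type, and the action labels are hypothetical, and the code assumes the bullets are evaluated from top to bottom, which the procedure does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional

# Messages quoted from the disk topology page in the procedure.
MSG_REPAIR = "Repair"
MSG_FORCE_REPLACE = "You need to forcibly replace the medium to rectify the fault"
MSG_MEDIUM_ERROR = "Medium status error. Change the medium."


@dataclass
class Decision:
    action: str                         # "emergency_guide" or "replace_device"
    ignore_media_fault: Optional[bool]  # None where the procedure gives no value


def decide(removed_from_pool: bool, page_message: str, alarm_51003: bool) -> Decision:
    """Map the observed device status to the next action (hypothetical helper)."""
    if removed_from_pool:
        if page_message == MSG_MEDIUM_ERROR:
            return Decision("replace_device", None)
        raise ValueError("status not covered by the procedure")
    # Device is still in the storage pool: check the cases in the listed order
    # (assumption: earlier bullets take precedence over later ones).
    if page_message == MSG_REPAIR or alarm_51003:
        return Decision("emergency_guide", None)
    if page_message == MSG_FORCE_REPLACE:
        return Decision("replace_device", False)
    # Not removed and no ALM-51003 alarm.
    return Decision("replace_device", True)
```

Recording the ignore_media_fault value at this point is useful because it is needed later as the -ignoreMediaFault argument of the forceReplaceSSD command.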

    Replace the faulty device.

    1. Perform the required operation based on whether the faulty device needs to be powered off.

      Table 2-2 shows the methods of powering off devices of different storage media.
      Table 2-2  Methods of powering off devices

      Device Medium Type | Disk Type Displayed on FusionStorage Block Self-Maintenance Platform | Power Off Method
      SAS disks, SATA disks, and SSDs | SAS Disk/SATA Disk/SSD Disk | No power-off is required.
      PCIe SSD cards (non-NVMe protocol) | SSD Card/NVMe SSD | Power off the server.
      NVMe SSD cards | SSD Card/NVMe SSD | Power off the server.
      NVMe SSDs | SSD Card/NVMe SSD | Perform a logical power-off.
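
Table 2-2 amounts to a lookup from device medium type to power-off method, which can be sketched as follows. The key spellings, the enum, and the function name are illustrative assumptions; only the mapping itself comes from the table.

```python
from enum import Enum


class PowerOff(Enum):
    NONE = "no power-off required"
    SERVER = "power off the server"
    LOGICAL = "perform a logical power-off"


# Mapping taken from Table 2-2; the key strings are my own shorthand.
POWER_OFF_METHOD = {
    "SAS disk": PowerOff.NONE,
    "SATA disk": PowerOff.NONE,
    "SSD": PowerOff.NONE,
    "PCIe SSD card (non-NVMe)": PowerOff.SERVER,
    "NVMe SSD card": PowerOff.SERVER,
    "NVMe SSD": PowerOff.LOGICAL,
}


def power_off_method(medium_type: str) -> PowerOff:
    """Look up the power-off method for a device medium type."""
    try:
        return POWER_OFF_METHOD[medium_type]
    except KeyError:
        raise ValueError(f"unknown medium type: {medium_type!r}") from None
```

Note that the platform shows the same disk type (SSD Card/NVMe SSD) for three different media, so the lookup has to be keyed on the physical medium type rather than on the displayed disk type.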

      If the SSD card or SSD is to be replaced, take note of its electronic serial number (ESN) displayed on the FusionStorage Block Self-Maintenance Platform before replacement.

      If disks added to a FusionStorage storage pool must be configured as redundant array of independent disks (RAID) 0, configure them as described in the server documentation. If disks in RAID 0 are hot-swapped, manually activate RAID 0 to add the disks to the system. Otherwise, the system cannot identify the disks. For details, see the server documentation.

      • If the faulty device can be replaced without a power-off, replace it directly.
      • If the faulty device is an NVMe SSD, you do not need to power off the server. However, you need to perform the following operations to logically power off the faulty NVMe SSD for replacement:
        1. On FusionStorage Block Self-Maintenance Platform, choose Hardware > Disks.
        2. Locate the row that contains the NVMe SSD to be replaced and click Power Off, as shown in Figure 2-9.
          Figure 2-9  Logical power-off
        3. Click Yes.
          NOTE:
          If the logical power-off fails, power off the server and then replace the faulty device.
      • If the faulty device can be replaced only after the server is powered off, perform the required operation based on the deployment scenario of FusionStorage.

        NOTE:
        • If some of the SSD cards on the server that provides storage resources for FusionStorage become faulty and need to be replaced, place the server into maintenance mode before the replacement and remove the server from maintenance mode after the replacement. For details, see Placing a Storage Node into Maintenance Mode and Removing a Storage Node from Maintenance Mode.
        • If all SSD cards on the server that provides storage resources for FusionStorage become faulty, you do not need to place the server into maintenance mode.
        • If a server is placed into maintenance mode, the timeout period after which the server is removed from the storage pool is extended by 45 minutes, giving you 75 minutes to replace the faulty parts. If the server stays out of the storage pool for more than 75 minutes, the storage pool starts to reconstruct data.

        • If FusionStorage is deployed in the FusionSphere system, replace the faulty device by performing operations provided in Parts Replacement in the FusionSphere product documentation.

        • If FusionStorage is deployed in the Server SAN system, take measures to protect running services, power off the server, and replace the faulty device with a new one.
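
The timing in the maintenance-mode note can be worked through with a short calculation. The 45-minute extension and the 75-minute total are stated in the note; the 30-minute figure without maintenance mode is inferred here as their difference and is not stated in the source. The function name and the sample timestamp are illustrative.

```python
from datetime import datetime, timedelta

MAINTENANCE_EXTENSION = timedelta(minutes=45)  # stated in the note
MAINTENANCE_WINDOW = timedelta(minutes=75)     # stated in the note
# Inferred, not stated: the timeout without maintenance mode (75 - 45 minutes).
DEFAULT_TIMEOUT = MAINTENANCE_WINDOW - MAINTENANCE_EXTENSION


def reconstruction_deadline(offline_since: datetime, maintenance_mode: bool) -> datetime:
    """Latest time the server can return before the pool reconstructs data."""
    window = MAINTENANCE_WINDOW if maintenance_mode else DEFAULT_TIMEOUT
    return offline_since + window


# Example: a server powered off at 10:00 in maintenance mode.
deadline = reconstruction_deadline(datetime(2019, 2, 1, 10, 0), maintenance_mode=True)
```

Computing the deadline before powering off the server makes it easier to judge whether the replacement can be completed within the window.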

    Determine the status of the hardware device.

    1. On the disk topology page of the storage pool, perform the required operation based on the status of the device (SSD device used as caches) after the replacement.

      • If the replaced device is not an NVMe device and has been automatically added to the storage pool and runs properly, no further action is required.
      • If the replaced device is an NVMe device and has been automatically added to the storage pool and runs properly, go to 5 (Enable hardware DIF).
      • If the device has been removed from the storage pool:

        If the replaced device is an NVMe device and hardware DIF was enabled before the fault occurred, you need to enable hardware DIF after the device is replaced but before it is added to the storage pool. For details, see the FusionStorage Block Storage Service Hardware DIF Configuration Guide.

        • SSD card: Click Replace the SSD card, select a new or idle storage device on the server, and add it to the storage pool.
        • SSD: Click Replace the SSD disk, select a new or idle storage device on the server, and add it to the storage pool.
      • If the faulty device has not been removed from the storage pool, go to 4 (Restore storage resources).
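
The post-replacement cases above can be summarized in a short sketch. The state names ("auto_added", "removed", "not_removed"), the function name, and the returned step descriptions are hypothetical labels chosen for illustration; the branching itself follows the bullets.

```python
from typing import List


def next_steps(pool_state: str, is_nvme: bool, dif_was_enabled: bool = False) -> List[str]:
    """Return the remaining actions after the physical replacement (hypothetical helper).

    pool_state is one of:
      "auto_added"  - new device was added automatically and runs properly
      "removed"     - device has been removed from the storage pool
      "not_removed" - faulty device has not been removed from the storage pool
    """
    if pool_state == "auto_added":
        # Non-NVMe devices need no further action; NVMe devices continue with DIF.
        return ["go to 5: enable hardware DIF"] if is_nvme else []
    if pool_state == "removed":
        steps = []
        if is_nvme and dif_was_enabled:
            steps.append("enable hardware DIF before adding the device to the pool")
        steps.append("click Replace the SSD card / Replace the SSD disk and add the device")
        return steps
    if pool_state == "not_removed":
        return ["go to 4: restore storage resources"]
    raise ValueError(f"unknown pool state: {pool_state!r}")
```

An empty list corresponds to the case where no further action is required.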

    Restore storage resources.

    1. Log in to the active FusionStorage Manager (FSM) node as user dsware and, in the CLI, run the command that matches the cache type to restore storage resources:

      NOTE:

      Since the system has been hardened, you need to enter the username and password for login authentication after running the dswareTool command of FusionStorage Block. The default username is cmdadmin, and its default password is IaaS@PORTAL-CLOUD9!.

      The system supports authentication using environment variables so that you do not need to repeatedly enter the username and password for authentication each time you run the dswareTool command. For details, see Authentication Using Environment Variables.

      To restore an SSD device, run the following command:

        sh /opt/dsware/client/bin/dswareTool.sh --op forceReplaceSSD -id <Storage pool ID> -oldEsn <ESN of the faulty SSD device> -newEsn <ESN of the new SSD device> -nodeMgrIp <Management IP address> -type cache -ignoreMediaFault <true|false>

      If the replaced device is not an NVMe device, no further action is required.
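
As a sketch of how the forceReplaceSSD arguments fit together, the following builds the command line from its parameters and prints it without running anything. The script path and option names are taken from the command above; the function name and all sample values (pool ID, ESNs, IP address) are placeholders invented for illustration. The tool still prompts for login authentication when actually run.

```python
import shlex
from typing import List

DSWARE_TOOL = "/opt/dsware/client/bin/dswareTool.sh"  # path from the procedure


def force_replace_ssd_cmd(pool_id: str, old_esn: str, new_esn: str,
                          node_mgr_ip: str, ignore_media_fault: bool) -> List[str]:
    """Assemble the forceReplaceSSD argument list for a cache device."""
    return [
        "sh", DSWARE_TOOL,
        "--op", "forceReplaceSSD",
        "-id", pool_id,
        "-oldEsn", old_esn,
        "-newEsn", new_esn,
        "-nodeMgrIp", node_mgr_ip,
        "-type", "cache",
        "-ignoreMediaFault", "true" if ignore_media_fault else "false",
    ]


# Placeholder values only; none of them come from a real system.
cmd = force_replace_ssd_cmd("0", "OLD_ESN", "NEW_ESN", "192.0.2.10", False)
print(shlex.join(cmd))  # dry run: show the command instead of executing it
```

Keeping the arguments as a list and printing them first allows the ESNs, noted before the replacement, to be checked against the command before it is run on the FSM node.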

    Enable hardware DIF.

    If the replaced device is an NVMe device and hardware DIF was enabled before the fault occurred, you need to enable hardware DIF after the device is replaced but before it is added to the storage pool.

    1. Remove the storage node.

      If the new device is automatically added to the storage pool, you need to remove the node housing the new device from the storage pool and then enable hardware DIF. For details, see Storage Pool Capacity Reduction.

    2. Enable hardware DIF.

      After the target storage node is removed from the storage pool, enable hardware DIF. For details, see the FusionStorage Block Storage Service Hardware DIF Configuration Guide.

    3. Add the storage node.

      After hardware DIF is enabled, add the node to the storage pool again. For details, see Expanding Capacity of a Storage Pool.

Updated: 2019-02-01

Document ID: EDOC1000175242
