FusionStorage V100R006C10 Block Storage Service Parts Replacement 06

Replacing the Hard Disk or SSD Device Used As the Metadata Disk

Scenarios

Replace the faulty hard disk or solid state disk (SSD) device that is used as the metadata disk on the server. If multiple metadata disks need to be replaced, repeat the operations in this section for each faulty device, one at a time.

When a metadata disk is faulty, the ALM-51806 Faulty Metadata Disk and ALM-51804 Abnormal ZooKeeper Process alarms will be generated in the FusionStorage system.

Impact on the System

This operation has no adverse impacts on the system.

Prerequisites

Conditions

  • Spare parts of the same model and specifications as the faulty component are available.
  • The server that contains the faulty device has been identified with a label on its front panel.
  • The FusionStorage storage pool is in the normal state and no reconstruction task is running before disk replacement.

    You can check the pool state and reconstruction tasks as follows: On the FusionStorage Block self-service maintenance platform, choose Resource Pool > Storage Pools and view the status of desired storage pools and reconstruction tasks. For example, Figure 2-10 shows that the storage pool is in the normal state and one reconstruction task is running with a progress of 57%.

    Figure 2-10  Checking the pool state and reconstruction tasks
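
The prerequisite above works as a simple gate: start the replacement only when the pool is normal and no reconstruction task is running. The following is a minimal sketch, assuming the pool status and the number of running reconstruction tasks have been read manually from the Storage Pools page. The function name and the literal status string "normal" are hypothetical and are not part of FusionStorage. Under this reading, the state in the Figure 2-10 example (normal pool, one task at 57%) would not yet meet the condition.

```shell
# Hypothetical helper, not a FusionStorage command.
# $1: pool status as read from the Storage Pools page (assumed string "normal")
# $2: number of reconstruction tasks currently running
pool_ready_for_replacement() {
    status=$1
    running_tasks=$2
    [ "$status" = "normal" ] && [ "$running_tasks" -eq 0 ]
}

# Example: a normal pool with one reconstruction task still running.
if pool_ready_for_replacement "normal" 1; then
    echo "OK to replace the disk"
else
    echo "Wait: pool is not normal or a reconstruction task is running"
fi
```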

Data

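| Category | Name | Description | Example |
| --- | --- | --- | --- |
| FusionStorage system information | Login address | Used for querying the disk topology of the faulty server, placing the server into maintenance mode, or removing the server from maintenance mode. | http://192.168.40.10:28443/fsportal |
| FusionStorage system information | Administrator username and password | Used for querying the disk topology of the faulty server, placing the server into maintenance mode, or removing the server from maintenance mode. | admin/IaaS@PORTAL-CLOUD8! |
| Baseboard management controller (BMC) information of the faulty server | BMC IP address | Used for logging in to the server BMC system to power off the server. | 192.168.30.7 |
| Baseboard management controller (BMC) information of the faulty server | Password for user root | Used for logging in to the server BMC system to power off the server. | Huawei12#$ |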

Procedure

    Replace the faulty device.

    1. Perform the required operation based on whether the faulty device needs to be powered off.

      Table 2-3 shows the methods of powering off devices of different storage media.
      Table 2-3  Methods of powering off devices

      | Device Medium Type | Disk Type Displayed on FusionStorage Block Self-Maintenance Platform | Power Off Method |
      | --- | --- | --- |
      | SAS disks, SATA disks, and SSDs | SAS Disk/SATA Disk/SSD Disk | Do not need to be powered off. |
      | PCIE SSD cards (non-NVMe protocol) | SSD Card/NVMe SSD | Power off the server. |
      | NVMe SSD cards | SSD Card/NVMe SSD | Power off the server. |
      | NVMe SSDs | SSD Card/NVMe SSD | Perform a logical power-off. |
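
Table 2-3 amounts to a lookup from medium type to power-off method. The sketch below encodes it as a shell function. The function name and the short keys (sas, sata, ssd, pcie-ssd-card, nvme-ssd-card, nvme-ssd) are hypothetical labels chosen for illustration, not values reported by FusionStorage.

```shell
# Hypothetical lookup based on Table 2-3; the keys are illustrative labels only.
power_off_method() {
    case "$1" in
        sas|sata|ssd)                echo "none" ;;     # no power-off needed
        pcie-ssd-card|nvme-ssd-card) echo "server" ;;   # power off the server
        nvme-ssd)                    echo "logical" ;;  # logical power-off of the SSD
        *)                           echo "unknown"; return 1 ;;
    esac
}

power_off_method nvme-ssd    # prints "logical"
```

The platform displays the same value, SSD Card/NVMe SSD, for three of the medium types, so the displayed disk type alone does not determine the method. The physical medium type must be known.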

      To view the metadata disk types, choose Resource Pool > Summary on the FusionStorage Block Self-Maintenance Platform and view the Metadata Storage Location values in the control cluster area, as shown in Figure 2-11.
      Figure 2-11  Metadata disk types
      • If the faulty device can be replaced without a power-off, replace it directly.
      • If the faulty device is an NVMe SSD, you do not need to power off the server, but you must perform a logical power-off for the NVMe SSD as follows:
        1. On the server, run the cat /opt/dsware/agent/conf/zk_slot_info command as user root to obtain the zk_esn value.
        2. Run the cat /proc/smio_host command as user root and find the entry that matches the zk_esn value. The Location value of that entry gives the slot number of the NVMe SSD.

          For example, if the Location value is 0:11, the slot number is 11.

        3. Run the /opt/dsware/agent/script/dsware_agent_handle.sh nvme_power_operation power_off slot_num command as user root to perform the logical power-off for the NVMe SSD.
          • Parameter slot_num specifies the slot number of the NVMe SSD to be replaced.
          • If the following information is displayed, the logical power-off is successful, and you can replace the faulty device:
            result=0;value=no;
          • If other information is displayed, the device to be replaced does not support logical power-off. Power off the server and then replace the faulty device.
      • If the faulty device can be replaced only after the server is powered off, perform the required operation based on the deployment scenario of FusionStorage.

        NOTE:

        If the server needs to be powered off before you replace the faulty component on a storage server, perform operations provided in Placing a Storage Node into Maintenance Mode before the replacement and perform operations provided in Removing a Storage Node from Maintenance Mode after the replacement.

        Placing a server into maintenance mode extends the timeout after which the server is removed from the storage pool by 45 minutes, giving you 75 minutes in total to replace the faulty parts. If the server stays out of the storage pool for more than 75 minutes, the storage pool reconstructs data.

        • If FusionStorage is deployed in the FusionSphere system, replace the faulty device by performing the operations provided in Parts Replacement in the FusionSphere product documentation.

        • If FusionStorage is deployed in the Server SAN system, take measures to protect running services, power off the server, and replace the faulty device with a new one.
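
The logical power-off steps for an NVMe SSD described earlier in this step can be sketched as two small helpers: one derives the slot number from a Location value, and the other checks the command output against the success string. This is an illustration under assumptions: the helper names are hypothetical, the Location value is assumed to have the form shown in the example (0:11), and the output is assumed to match the success string exactly. The sketch only prints the power-off command and does not run it.

```shell
# Extract the slot number from a Location value such as "0:11" (slot 11).
slot_from_location() {
    echo "${1##*:}"    # drop everything up to and including the last colon
}

# Succeed only if the output equals the success string given in this section.
logical_power_off_ok() {
    [ "$1" = "result=0;value=no;" ]
}

AGENT=/opt/dsware/agent/script/dsware_agent_handle.sh
slot=$(slot_from_location "0:11")

# Print the command instead of executing it; run it manually as user root.
echo "Would run: $AGENT nvme_power_operation power_off $slot"

# Sample output pasted from a run (here, the success string from this section).
sample_output="result=0;value=no;"
if logical_power_off_ok "$sample_output"; then
    echo "Logical power-off succeeded; replace the NVMe SSD."
else
    echo "Logical power-off not supported; power off the server instead."
fi
```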

    Restore the metadata disk.

    1. Log in to the CLI of the active FusionStorage Manager (FSM) node as user dsware, and run the following command to restore the metadata disk:

      NOTE:

      Because the system is security-hardened, the dswareTool command of FusionStorage Block prompts for a username and password for login authentication each time it is run. The default username is cmdadmin, and its default password is IaaS@PORTAL-CLOUD9!.

      The system supports authentication using environment variables so that you do not need to repeatedly enter the username and password for authentication each time you run the dswareTool command. For details, see Authentication Using Environment Variables.

      sh /opt/dsware/client/bin/dswareTool.sh --op restoreControlNode -ip <management IP address of the host accommodating the new storage device> -zkDiskSlot <slot number of the metadata disk>

      In this command, zkDiskSlot specifies the slot number of the new metadata disk. It is required if the slot number of the metadata disk has changed.

    2. On the FusionStorage Block Self-Maintenance Platform, choose Resource Pool > Summary and view the statuses of ZooKeeper processes in the Process status area.

      If the statuses of ZooKeeper processes are normal, the replacement is successful.
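
The restore command in step 1 can be assembled from its two inputs as sketched below. The function name is hypothetical and the IP address is a placeholder. The sketch also assumes that -zkDiskSlot can be omitted when the slot number is unchanged, which the text implies but does not state outright. The function only prints the command, so it can be reviewed before being run on the active FSM node.

```shell
# Print the restoreControlNode command for a host IP and an optional slot number.
# -zkDiskSlot is appended only when a slot number is given (assumption, see above).
build_restore_cmd() {
    ip=$1
    slot=${2:-}
    cmd="sh /opt/dsware/client/bin/dswareTool.sh --op restoreControlNode -ip $ip"
    if [ -n "$slot" ]; then
        cmd="$cmd -zkDiskSlot $slot"
    fi
    echo "$cmd"
}

# Placeholder values, not taken from a real deployment.
build_restore_cmd 192.168.40.21
build_restore_cmd 192.168.40.21 11
```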

Updated: 2019-02-01

Document ID: EDOC1000175242
