
Basic Storage Service Configuration Guide for Block

OceanStor Dorado V3 Series V300R002

This document is applicable to OceanStor Dorado3000 V3, Dorado5000 V3, Dorado6000 V3, and Dorado18000 V3. It describes the basic storage services and explains how to configure and manage them.
Basic Storage Principles

OceanStor Dorado V3 uses the RAID 2.0+ and dynamic RAID technologies to achieve dynamic allocation and expansion of storage resources in storage pools.

Basic Concepts

  • Block virtualization

    A new type of RAID technology. Block virtualization divides disks into multiple chunks (CKs) of a fixed size and organizes them into multiple chunk groups (CKGs). When a disk fails, all disks that hold CKs belonging to the same CKGs as the faulty disk participate in reconstruction. This significantly increases the number of disks involved in reconstruction, improving the data reconstruction speed.

  • CK

    Disks are divided into blocks. Each block is assigned a number and constitutes a CK. A CK is the smallest unit of a RAID group.

  • CKG

    A logical collection of N+M CKs on different disks. N is the number of data blocks in a CKG and changes with the number of disks involved. M is the number of parity blocks in a CKG. M is a fixed value and depends on the data type. The default value of M is 3 for a metadata CKG and 2 for a data CKG. A CKG has the properties of a RAID group.

  • Zone

    A collection of CKGs used for CKG management. Different zones differ in their RAID properties, usage (data, log, or metadata), and I/O write modes.

  • Grain

    CKGs are divided into small, fixed-size blocks called grains (the default size of a grain is 8 KB). Grains are the basic units that constitute a thin LUN. Logical Block Addresses (LBAs) in a grain are consecutive.

  • Dynamic RAID

    A new RAID algorithm that dynamically adjusts the number of CKs in CKGs to balance system reliability and storage capacity. If a CK is faulty and no new CK is available from a non-member disk, the system dynamically reconstructs the original N+M CKs as (N-1)+M CKs. When a new SSD is inserted, the system migrates data from the (N-1)+M CKs to newly constructed N+M CKs to restore efficient disk usage.

  • Disk domain

    A collection of multiple disks.

  • Storage pool

    A storage resource container, which is created under a disk domain. The storage resources used by application servers are from storage pools.

  • Hot spare space

    Space used for reconstructing data from faulty blocks in block virtualization. When a CK is faulty, the system replaces it with a CK from the hot spare space and uses the other CKs in the CKG to reconstruct the faulty CK's data onto the hot spare CK. This ensures data integrity and read/write performance.

  • Reconstruction

    A process of restoring the data saved on a faulty disk to hot spare CKs and replacing the CKs on the faulty disk with the hot spare CKs. During data reconstruction, valid data and parity data must be read and processed to restore the data saved on a faulty disk to hot spare space, thereby ensuring data security and reliability. Traditional reconstruction technologies allow only all disks in the same RAID group as the faulty disk to participate in reconstruction. The RAID 2.0+ technology enables all disks in the same disk domain as the faulty disk to participate in reconstruction, boosting data reconstruction speed and shortening data recovery duration.

  • Deduplication

    Data reduction technology that deletes duplicate data in a storage system to reduce the capacity required for storing data.

  • Data compression

    Compresses data without causing data loss, improving the efficiency in data storage, transfer, and processing.

Block Virtualization Process

Figure 2-1 shows the block virtualization process.

Figure 2-1 Block virtualization process

  1. Multiple disks form a disk domain.
  2. The storage system divides the storage media in a disk domain into fixed-size CKs.
  3. CKs are configured into CKGs and hot spare space based on the RAID policy and hot spare policy specified on DeviceManager.
  4. A storage pool is composed of multiple CKGs. The storage system dynamically adjusts the number of data blocks in CKGs based on the dynamic RAID algorithm, improving space usage and reliability.
  5. When a user creates LUNs, grains are mapped to the LUNs for finer-grained management of storage resources.
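
For illustration only, the layering above can be pictured with a short Python sketch. The class names, constants, and the assumed CK size are placeholders for readability and do not reflect the storage system's internal data model.

```python
# Illustrative model of the RAID 2.0+ layering: disk domain -> CK -> CKG ->
# storage pool -> grain. All names and sizes here are assumptions.

CHUNK_SIZE = 64 * 1024 * 1024   # assumed CK size, for illustration only
GRAIN_SIZE = 8 * 1024           # default grain size stated above (8 KB)

class Disk:
    def __init__(self, disk_id, capacity):
        # Step 2: divide the disk into fixed-size chunks (CKs).
        self.disk_id = disk_id
        self.chunks = [(disk_id, offset)
                       for offset in range(0, capacity, CHUNK_SIZE)]

class ChunkGroup:
    """Step 3: N data CKs plus M parity CKs taken from different disks."""
    def __init__(self, data_cks, parity_cks):
        self.data_cks = data_cks       # N chunks
        self.parity_cks = parity_cks   # M chunks (2 for data, 3 for metadata)

class StoragePool:
    """Steps 4-5: a pool is a set of CKGs; LUN space is mapped grain by grain."""
    def __init__(self, chunk_groups):
        self.chunk_groups = chunk_groups

    def grains_per_ckg(self, n_data_cks):
        # Each CKG's data capacity is carved into 8 KB grains for LUN mapping.
        return (n_data_cks * CHUNK_SIZE) // GRAIN_SIZE
```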

Mapping Table

Each controller maintains a mapping table, which stores the mapping relationship between user information and data storage locations. The mapping table contains the following contents:

  • Mapping between volume information (LUN ID + version) and the fingerprint index (deduplication enabled) or grain address (deduplication disabled).
  • Mapping between the fingerprint index and grain address (deduplication enabled).
  • Mapping between the CK address and the physical location in the SSD.

Figure 2-2 shows an example of a mapping table.

Figure 2-2 Example mapping table
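
For illustration, the three mappings can be pictured as simple lookup tables. This is a hedged sketch only; the key and value shapes (tuples, strings) are assumptions and do not reflect the actual metadata format.

```python
# Sketch of the per-controller mapping table as three lookup dictionaries.
# Key/value shapes are assumptions for readability only.

mapping_table = {
    # (LUN ID, version, LBA) -> fingerprint (deduplication enabled)
    #                        or grain address (deduplication disabled)
    "volume_to_fingerprint": {(0, 1, 0x1000): "fp_a3c9"},

    # fingerprint -> grain address (deduplication enabled)
    "fingerprint_to_grain": {"fp_a3c9": ("ckg_7", 42)},   # (CKG, grain index)

    # CK address -> physical location on the SSD
    "ck_to_physical": {("disk_3", 5): ("ssd_3", 0x7F000)},
}

def resolve_grain(lun_id, version, lba):
    """Follow volume information -> fingerprint -> grain, as a read would."""
    fp = mapping_table["volume_to_fingerprint"][(lun_id, version, lba)]
    return mapping_table["fingerprint_to_grain"][fp]

print(resolve_grain(0, 1, 0x1000))   # -> ('ckg_7', 42)
```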

Write I/O Processing Principles (Deduplication and Compression Enabled by Default)

During I/O write operations, data streams from the host to the storage system are divided into multiple data blocks. The storage system computes a fingerprint, a unique identifier, for each data block.

The storage system maintains a fingerprint table to record the existence and storage location of the data. Mappings of volume information, fingerprints, and grains are recorded in metadata.

When new data is written into the system, the storage system checks whether the fingerprint, and therefore the corresponding data block, already exists.

  • If the fingerprint does not exist, the system performs the following operations:
    1. Compresses data if compression is enabled.
    2. Selects a grain location to store the data block based on the fingerprint.
    3. Records the mapping information between the fingerprint and the grain.
    4. Increases the reference count of the fingerprint by one.
    5. Writes the data to the physical location of the SSD according to the grain location information.
  • If the fingerprint exists, the system compares the new data block with the stored data block byte by byte. After confirming that the data already exists, the system records the volume information of the data in the fingerprint mapping and increases the reference count of the fingerprint by one.

Figure 2-3 shows how write I/Os are processed.

Figure 2-3 Write I/O processing

Upon receiving data from a host, the storage system performs the following operations:

  1. Analyzes and segments the data into data blocks of a fixed size.
  2. Records the volume information of each data block and assigns a fingerprint to each data block.
  3. Determines whether the data is duplicate based on the fingerprint.
    • If the data is duplicate, the system performs deduplication.
    • If the data is unique, the system records the mapping between the fingerprint and the grain of the data.
  4. Increases the reference count of the fingerprint by one.
  5. Writes the data block to the grain.
  6. Returns a write success message to the host.
  7. Writes the data to the physical location on the SSD.
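
The write flow above can be summarized in a short sketch. This is a simplified, assumed implementation: the SHA-256 fingerprint, the 8 KB block size, the in-memory tables, and the allocate_grain helper are all illustrative and are not the product's actual algorithms.

```python
import hashlib
import zlib

BLOCK_SIZE = 8 * 1024            # assumed fixed block/grain size
fingerprint_table = {}           # fingerprint -> {"grain": ..., "refcount": n}
volume_table = {}                # (lun_id, lba) -> fingerprint

def allocate_grain(payload):
    # Placeholder for grain allocation and the physical SSD write (step 7).
    return ("ckg_0", len(fingerprint_table))

def write(lun_id, lba, data, compression_enabled=True):
    # Steps 1-2: segment the stream into fixed-size blocks and fingerprint each.
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        fp = hashlib.sha256(block).hexdigest()

        entry = fingerprint_table.get(fp)
        if entry is None:
            # Unique data: compress (if enabled), pick a grain, record the
            # fingerprint-to-grain mapping, and write to the SSD.
            payload = zlib.compress(block) if compression_enabled else block
            fingerprint_table[fp] = {"grain": allocate_grain(payload),
                                     "refcount": 1}
        else:
            # Duplicate data: the system also compares the blocks byte by byte
            # before trusting the fingerprint match, then deduplicates.
            entry["refcount"] += 1

        # Record the volume-to-fingerprint mapping for this logical address.
        volume_table[(lun_id, lba + offset)] = fp
    return "write success"       # step 6: acknowledge the host
```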

Read I/O Processing Principles

During I/O read operations, the system looks up the logical address of the requested data to find its fingerprint. It then checks the mapping between the fingerprint and the grain and, based on the grain, locates the data block at its physical location on the SSD.
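
Continuing the assumed tables from the write sketch above, the read path is simply a chain of lookups. The function below is illustrative only.

```python
def read(lun_id, lba, volume_table, fingerprint_table):
    """Logical address -> fingerprint -> grain; the grain then gives the
    physical SSD location of the data block."""
    fp = volume_table[(lun_id, lba)]
    grain = fingerprint_table[fp]["grain"]
    return grain   # a real read would fetch and decompress the block here
```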

Reconstruction

Dynamic RAID reconstructs data by CK. When a CK in a CKG fails, the system checks whether a new CK from a non-RAID-member disk is available.

  • If yes, the system reconstructs the data on the failed CK and writes it to the new CK. In this way, the system retains the N+M blocks of the RAID level.
  • If not, the system regroups the CKG into (N-1)+M blocks, reconstructs the data on the failed CK, and writes the data to the new CKG.

Figure 2-4 and Figure 2-5 show the reconstruction process. In this example, N = 10, M = 2. The process is similar in other cases.

Figure 2-4 Reconstruction process in scenarios where the system can choose a CK from another disk

Figure 2-5 Reconstruction process in scenarios where the system cannot choose a CK from another disk

The main idea of dynamic RAID reconstruction is to reduce the number of RAID data blocks (from 10+2 to 9+2 in the preceding example) when a CK fails and no new CK is available. Because the number of parity blocks is unchanged, the RAID level is not degraded.
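
The decision can be sketched as follows. The data layout and function name are assumptions, and a real reconstruction rebuilds the lost data from the surviving data and parity CKs rather than simply dropping the failed member.

```python
# Sketch of the dynamic RAID choice made when a CK in a CKG fails.
# Structures and names are illustrative assumptions.

def reconstruct(ckg, failed_ck, spare_cks):
    surviving = [ck for ck in ckg["data"] if ck != failed_ck]
    if spare_cks:
        # A CK on a non-member disk is available: rebuild the lost data onto
        # it and keep the original N+M layout.
        ckg["data"] = surviving + [spare_cks.pop()]
    else:
        # No spare CK: regroup to (N-1)+M, rebuilding the lost data into the
        # remaining members so the number of parity blocks (M) is unchanged.
        ckg["data"] = surviving
    return ckg

# Example matching Figures 2-4 and 2-5, with N = 10 and M = 2.
ckg = {"data": [f"ck{i}" for i in range(10)], "parity": ["p0", "p1"]}
reconstruct(ckg, "ck3", spare_cks=[])   # now 9 data CKs + 2 parity CKs
```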

Global Garbage Collection

All the data and metadata in a storage pool are written into data blocks using redirect-on-write (ROW): the system writes new data to new storage locations and marks the data in the original locations as junk data. As the system runs, continuous overwrites and metadata modifications generate a large number of space fragments, so garbage must be collected in a timely manner to release space for writing service data.

Garbage is distributed unevenly and randomly among CKs in CKGs. Different reclaiming policies are adopted for garbage collection based on the size of junk data in CKs.

Figure 2-6 shows the global garbage collection process.

Figure 2-6 Global garbage collection
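
A garbage-collection pass can be pictured with the sketch below. The 50% junk-ratio threshold and the data shapes are assumptions made for illustration; as noted above, the product applies different reclaiming policies depending on the amount of junk data in CKs.

```python
# Sketch of a garbage-collection pass under redirect-on-write: CKGs whose junk
# ratio crosses a threshold have their remaining valid grains relocated, after
# which the whole CKG is reclaimed as free space.

JUNK_RATIO_THRESHOLD = 0.5   # assumed policy threshold

def relocate(valid_grains):
    # Placeholder: ROW rewrites the valid grains into new CKGs.
    return len(valid_grains)

def collect_garbage(ckgs):
    reclaimed = []
    for ckg in ckgs:
        junk_ratio = ckg["junk_grains"] / ckg["total_grains"]
        if junk_ratio >= JUNK_RATIO_THRESHOLD:
            relocate(ckg["valid_grains"])
            reclaimed.append(ckg)           # entire CKG becomes free space
    return reclaimed
```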

Scenarios Where the LUN Write Mode Changes to Write Protection

The default write mode of LUNs in a storage system is write back. However, the write mode will change to write protection if any of the faults listed in Table 2-1 occurs.

Table 2-1 Scenarios where the write mode of LUNs changes to write protection

Malfunction type: The temperature of a controller exceeds the upper limit.

Scenario:
  • If the Controller Enclosure Temperature Exceeds The Upper Limit alarm is generated due to abnormal equipment room temperature or an internal component exception, LUNs stay in write back mode for 192 hours. If the alarm persists for more than 192 hours, the mode changes to write protection.
  • If the Controller Enclosure Temperature Exceeds The Upper Limit alarm is generated due to the failure of a single controller, LUNs stay in write back mode for 1 hour. If the alarm persists for more than 1 hour, the mode changes to write protection.

NOTE:
If the Controller Enclosure Temperature Is Far Beyond The Upper Limit alarm is generated in a storage system, the storage system automatically powers off.

Impact: The write mode of all LUNs belonging to the controller enclosure changes to write protection.

Recommended action: Check the external cooling system, fan modules, and air ducts to find and solve the problem.

Malfunction type: Backup battery units (BBUs) on a controller enclosure malfunction.

Scenario: If two BBUs malfunction and an alarm is generated, the write mode of LUNs changes from write back to write protection.

Impact: The write mode of all LUNs belonging to the controller enclosure changes to write protection.

Recommended action:
  • Verify that the BBUs are properly installed.
  • Check whether the BBUs have broken down. If they have, replace them.
  • Check whether the BBUs have sufficient power. If they do not, wait until the BBUs are fully charged.

Malfunction type: The built-in coffer disks of multiple controllers malfunction.

Scenario:
  • Dual-controller storage device: If the built-in coffer disks of both controllers break down, the write mode of LUNs changes from write back to write protection.
  • Four-controller storage device: If all coffer disks of controllers A and B, or of controllers C and D, break down, the write mode of LUNs changes from write back to write protection.

Impact: The write mode of all LUNs belonging to the controller enclosure changes to write protection.

Recommended action: Check whether the built-in coffer disks of the controllers are faulty. If they are, replace them.

Malfunction type: A controller malfunctions.

Scenario: By default, LUNs stay in write back mode for 192 hours (the write back hold time) after a single controller malfunctions. If the fault is not rectified within this period, the write mode of the LUNs changes from write back to write protection.

Impact: The write mode of all LUNs belonging to the controller enclosure changes to write protection if the fault persists for more than 192 hours.

Recommended action:
  • Replace the faulty controller at off-peak hours within the write back hold time.
  • If a spare part is unavailable during the write back hold time, you can extend the hold time appropriately after assessing the risks, preventing write protection from adversely affecting services.

Malfunction type: The remaining capacity of a storage pool is smaller than the reserved capacity.

Scenario: An alarm is generated, indicating that the capacity usage of the storage pool exceeds the threshold and reminding you to expand the storage pool.

Impact: The write mode of LUNs changes to write protection.

Recommended action: Expand the storage pool.
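
Table 2-1 can be read as a simple policy: LUNs stay in write back mode unless a listed fault occurs, and some faults only take effect after a hold time expires. The sketch below models this with assumed fault identifiers and a single hold-time table; it is not the storage system's actual state machine.

```python
# Assumed fault identifiers; hold times follow Table 2-1.
WRITE_BACK_HOLD_HOURS = {
    "enclosure_over_temperature": 192,   # room temperature / component fault
    "controller_over_temperature": 1,    # caused by a single controller failure
    "single_controller_failure": 192,    # default write back hold time
}
IMMEDIATE_WRITE_PROTECTION = {
    "two_bbus_failed",
    "coffer_disks_of_multiple_controllers_failed",
    "pool_capacity_below_reserved",
}

def lun_write_mode(fault=None, hours_since_fault=0.0):
    """Return 'write back' or 'write protection' for the given fault state."""
    if fault in IMMEDIATE_WRITE_PROTECTION:
        return "write protection"
    hold = WRITE_BACK_HOLD_HOURS.get(fault)
    if hold is not None and hours_since_fault > hold:
        return "write protection"
    return "write back"                  # default mode, no qualifying fault
```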
