Data Replication
Data replication writes the service data generated by hosts to the secondary LUNs in the secondary storage system for backup and disaster recovery (DR). The writing process depends on the remote replication mode. This section describes data replication in synchronous and asynchronous remote replication modes.
Writing Process in Synchronous Remote Replication
Synchronous remote replication replicates data in real time from the primary storage system to the secondary storage system. The characteristics of synchronous remote replication are as follows:
- After receiving a write I/O request from a host, the primary storage system sends the request to the primary and secondary LUNs.
- The data write result is returned to the host only after the data is written to both the primary and secondary LUNs. If a write to either LUN fails, that LUN returns a write I/O failure to the remote replication management module, which then switches from dual-write to single-write mode and interrupts the remote replication pair. In this case, the data write result depends only on whether the data is successfully written to the primary LUN, regardless of the secondary LUN.
After a synchronous remote replication pair is created between a primary LUN and a secondary LUN, you need to manually perform synchronization so that data on the two LUNs is consistent. Every time a host writes data to the primary storage system after synchronization, the data is copied from the primary LUN to the secondary LUN of the secondary storage system in real time.
The specific process is as follows:
- Initial synchronization
After a remote replication pair is created between a primary LUN on the primary storage system at the production site and a secondary LUN on the secondary storage system at the DR site, initial synchronization is started.
- All data on the primary LUN is copied to the secondary LUN.
- During initial synchronization, if the primary LUN receives a write request from a host and data is written to the primary LUN, the data is also written to the secondary LUN.
- Dual-write
After initial synchronization is complete, the data on the primary LUN is the same as that on the secondary LUN. Then an I/O request is processed as follows:
Figure 1-2 shows how synchronous remote replication processes a write I/O request.
- The primary storage system at the production site receives the write request. HyperReplication records the write request in a log. The log contains the address information instead of the specific data.
- The write request is written to both the primary and secondary LUNs. Generally, the LUNs are in write-back status, so the data is written to the primary cache and secondary cache.
- HyperReplication waits for the primary and secondary LUNs to return the write result. If both writes succeed, the log is cleared. Otherwise (for example, if the write to the secondary LUN times out or fails), the log is stored in the DCL and the remote replication pair between the primary and secondary LUNs is interrupted. During subsequent data synchronization, the data blocks at the addresses recorded in the log are synchronized.
- HyperReplication returns the data write result to the host. The data write result of the primary LUN prevails.
LOG: data write log
DCL: data change log
The DCL is stored on all disks and all DCL data has three copies for protection. Storage system logs are stored on coffer disks.
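The dual-write steps above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not HyperReplication's implementation: the names `Lun`, `SyncPair`, `host_write`, and `resync` are hypothetical, LUNs are modeled as dictionaries, and the two writes are issued one after the other rather than in parallel.

```python
class Lun:
    """Toy LUN (hypothetical): block address -> data, with a failure switch."""

    def __init__(self):
        self.blocks = {}
        self.fail_writes = False  # set to True to simulate a failed or timed-out write

    def write(self, address, data):
        if self.fail_writes:
            return False
        self.blocks[address] = data
        return True


class SyncPair:
    """Toy synchronous remote replication pair (hypothetical names throughout)."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.log = set()          # addresses of in-flight writes (addresses only, no data)
        self.dcl = set()          # addresses whose blocks may differ between the LUNs
        self.interrupted = False  # True once the pair has dropped to single-write

    def host_write(self, address, data):
        if self.interrupted:
            # Single-write mode: only the primary LUN is written; the changed
            # address is tracked so that it can be synchronized later.
            ok = self.primary.write(address, data)
            if ok:
                self.dcl.add(address)
            return ok

        self.log.add(address)                    # step 1: log the address
        primary_ok = self.primary.write(address, data)      # step 2: dual-write
        secondary_ok = self.secondary.write(address, data)
        self.log.discard(address)                # step 3: clear the log entry ...
        if not (primary_ok and secondary_ok):
            self.dcl.add(address)                # ... or move it to the DCL
            self.interrupted = True
        return primary_ok                        # step 4: the primary result prevails

    def resync(self):
        """Copy only the blocks recorded in the DCL, then resume dual-write."""
        for address in sorted(self.dcl):
            data = self.primary.blocks.get(address)
            if data is not None and not self.secondary.write(address, data):
                return False  # secondary still unavailable; keep the remaining DCL
            self.dcl.discard(address)
        self.interrupted = False
        return True
```

In this sketch, a write that fails on the secondary LUN still returns success to the host, because the result of the primary LUN prevails, and the affected address is kept in the DCL until `resync` copies the block.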
Writing Process in Asynchronous Remote Replication
Asynchronous remote replication periodically replicates data from the primary storage system to the secondary storage system. The characteristics of asynchronous remote replication are as follows:
- Asynchronous remote replication relies on the snapshot technology. A snapshot is a point-in-time copy of source data.
- When a host successfully writes data to a primary LUN, the primary storage system returns a response to the host declaring the successful write.
- Data synchronization is triggered manually or automatically at preset intervals to ensure data consistency between the primary and secondary LUNs.
HyperReplication in asynchronous mode adopts the multi-time-segment caching technology. The working principle of the technology is as follows:
A time segment is identified by a serial number that tags the data written to a LUN during one period. The data is then written to disks. The serial number distinguishes this data from snapshot data.
- After an asynchronous remote replication relationship is set up between primary and secondary LUNs, the initial synchronization begins by default. The initial synchronization copies all data from the primary LUN to the secondary LUN to ensure data consistency.
- After the initial synchronization is complete, the data status of the secondary LUN becomes consistent (that is, data on the secondary LUN is a copy of the data on the primary LUN at a past point in time). Then the I/O process starts. Figure 1-3 shows the writing process in asynchronous remote replication mode.
- When a synchronization task starts in asynchronous remote replication, a snapshot is generated on both the primary and secondary LUNs (snapshot X on the primary LUN and snapshot X - 1 on the secondary LUN), and the point in time is updated.
- New data from the host is stored in the cache of the primary LUN and belongs to the X + 1 point in time.
- A response is returned to the host, indicating that the data write is complete.
When receiving a write request from a host, the primary storage system sends the data to the primary LUN. As soon as the primary LUN returns a write success result, the primary storage system returns the write success result to the host. When the synchronization period is reached, the data is synchronized to the secondary LUN.
- The differential data that is stored on the primary LUN at the X point in time is copied to the secondary LUN based on the DCL.
- The primary LUN and secondary LUN store the data that they have received to disks. After data is synchronized from the primary LUN to the secondary LUN, the latest data on the secondary LUN is a full copy of data on the primary LUN at the X point in time.
The DCL is stored on all disks and all DCL data has three copies for protection. Storage system logs are stored on coffer disks.
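One synchronization cycle in asynchronous mode can be sketched as follows. This is a simplified model, not the product's implementation: the class `AsyncPair` and its methods are hypothetical, a snapshot is modeled as a full dictionary copy rather than a space-efficient snapshot, and the optional `during_copy` callback exists only to simulate host writes that arrive while the copy is in progress.

```python
class AsyncPair:
    """Toy asynchronous remote replication pair (hypothetical names throughout)."""

    def __init__(self):
        self.primary = {}    # block address -> data on the primary LUN
        self.secondary = {}  # block address -> data on the secondary LUN
        self.dcl = set()     # addresses changed since the last synchronization started

    def host_write(self, address, data):
        # The host is acknowledged as soon as the primary LUN is written;
        # the secondary LUN is not involved at this point.
        self.primary[address] = data
        self.dcl.add(address)
        return True

    def synchronize(self, during_copy=None):
        # Freeze point in time X on the primary LUN (full copy for simplicity).
        snapshot_x = dict(self.primary)
        # Take over the current DCL; later writes belong to point in time X + 1.
        pending, self.dcl = self.dcl, set()
        if during_copy is not None:
            during_copy()  # simulated host writes arriving during synchronization
        # Copy only the differential data, reading from the snapshot so that
        # the secondary LUN receives the state at X, not a mix of X and X + 1.
        for address in pending:
            self.secondary[address] = snapshot_x[address]
        return len(pending)


pair = AsyncPair()
pair.host_write(0, "a")
pair.host_write(1, "b")
# A host write to block 0 arrives while the cycle is copying data.
copied = pair.synchronize(during_copy=lambda: pair.host_write(0, "a2"))
```

After this cycle, the secondary LUN in the sketch holds the primary LUN's data as of point in time X, while the newer write to block 0 stays recorded in the DCL for the next cycle.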
When synchronization starts (either manually or automatically when the synchronization period elapses), snapshots of the primary and secondary LUNs are generated and activated. The functions of the snapshots are as follows:
- Primary LUN snapshot
Ensures that data read from the primary LUN during data synchronization is always consistent and allows simultaneous implementation of data synchronization and data write to the primary LUN.
- Secondary LUN snapshot
Preserves a backup of the data that was on the secondary LUN before synchronization so that data on the secondary LUN is still usable even if a failure occurs during synchronization.
The snapshot function is used only during data synchronization. After data synchronization is complete, the snapshot function stops to release capacity reserved for snapshots, thereby reducing system overhead and improving performance.
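The role of the secondary LUN snapshot can be illustrated with a short sketch. It assumes that recovery takes the form of rolling the secondary LUN back to the snapshot, which the text implies but does not state. The function `synchronize_with_rollback` and its `fail_after` parameter are hypothetical, and the snapshot is again modeled as a full dictionary copy.

```python
def synchronize_with_rollback(secondary, changes, fail_after=None):
    """Apply `changes` to `secondary`; restore the snapshot if the copy fails.

    `fail_after` (hypothetical) simulates a link failure after that many
    blocks have been copied. Returns True on success, False after a rollback.
    """
    snapshot = dict(secondary)  # state of the secondary LUN before synchronization
    try:
        for count, (address, data) in enumerate(changes.items()):
            if fail_after is not None and count >= fail_after:
                raise ConnectionError("replication link lost (simulated)")
            secondary[address] = data
    except ConnectionError:
        # A partially synchronized LUN is inconsistent, so fall back to the
        # snapshot, which still holds the last consistent copy.
        secondary.clear()
        secondary.update(snapshot)
        return False
    # In this sketch the snapshot is simply discarded once the copy succeeds.
    return True


secondary_lun = {0: "old0", 1: "old1"}
failed_result = synchronize_with_rollback(
    secondary_lun, {0: "new0", 1: "new1"}, fail_after=1
)
```

Here the simulated failure occurs after one block has been copied, and the secondary LUN is returned to its pre-synchronization contents instead of being left half updated.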