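How to Recover Services Interrupted by Loss of the Disk Encryption Key
Question
How can services be recovered when the key of an encrypted storage pool is damaged?
Answer
After the disk encryption key is lost, if the storage system experiences a transient disconnection, the encrypted disks cannot obtain the key and therefore cannot connect to the storage system. As a result, the storage pool becomes faulty and services are interrupted.
In this case, perform the following operations to recover services:
Some operations must be performed in developer mode of the CLI. It is recommended that Huawei technical support engineers perform the service recovery operations.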
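- Restore the key.
For details about how to restore the key, see "How Do I Restore the Disk Encryption Key File?"
- Identify the faulty disks.
Log in to the CLI and run the show disk general command to check the status of each encrypted disk.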
admin:/>show disk general
ID        Health Status  Running Status  Type      Capacity   Role         Disk Domain ID  Speed(RPM)  Health Mark  Bar Code              Item      AutoLock State  Key Expiration Time
------    -------------  --------------  --------  ---------  ---------    --------------  ----------  -----------  --------------------  --------  --------------  -------------------
DAE000.0  Fault          Online          SSD SED   561.994GB  Member Disk  0               10000       --           210235G6BB1000000007  0235G6BB  ON              2020-12-31
DAE000.1  Fault          Online          SSD SED   561.994GB  Member Disk  0               10000       --           210235G6BB1000000007  0235G6BB  ON              2020-12-31
DAE000.2  Fault          Online          SSD SED   561.994GB  Member Disk  0               10000       --           210235G6BB1000000007  0235G6BB  ON              2020-12-31
If a disk's "AutoLock State" is "ON" and its "Health Status" is "Fault", the disk is a faulty disk.
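The check described above can be sketched in Python for cases where the show disk general output has been saved to a text file. This is a minimal sketch, not a Huawei tool: the function names parse_cli_table and find_locked_fault_disks are hypothetical, and the parser assumes the output keeps its fixed-width layout, with a header line directly above a line of dash runs and no value extending into the next column.

```python
import re


def parse_cli_table(text):
    """Parse fixed-width CLI table output into a list of dicts.

    Assumption: the header line sits directly above a separator line made
    only of dash runs and spaces, and each dash run starts a column.
    """
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    sep_idx = None
    for i, ln in enumerate(lines):
        if i > 0 and re.fullmatch(r"[-\s]+", ln):
            sep_idx = i
            break
    if sep_idx is None:
        raise ValueError("no separator line found in CLI output")

    # Column start offsets come from the dash runs in the separator line.
    starts = [m.start() for m in re.finditer(r"-+", lines[sep_idx])]
    ends = starts[1:] + [None]

    def cut(line):
        return [line[s:e].strip() for s, e in zip(starts, ends)]

    header = cut(lines[sep_idx - 1])
    return [dict(zip(header, cut(ln))) for ln in lines[sep_idx + 1:]]


def find_locked_fault_disks(rows):
    """Return IDs of disks with AutoLock State ON and Health Status Fault."""
    return [
        r["ID"]
        for r in rows
        if r.get("AutoLock State") == "ON" and r.get("Health Status") == "Fault"
    ]
```

Passing the saved command output to parse_cli_table and the result to find_locked_fault_disks yields the list of disk IDs to handle in the next step. The result should still be checked against the CLI output by an engineer before any disk is powered off.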
- Power off and then power on each faulty disk.
Log in to the CLI, enter developer mode, and run the poweroff disk and poweron disk commands.
engineer:/>poweroff disk disk_id=DAE000.0
DANGER: You are about to power off the disk. This operation causes the disk to be unreadable and unwritable for services. If the disk domain where the disk resides is in the reconstruction or degradation state, this operation may cause reconstruction failure, service interruption, and data loss.
Suggestion: Before performing this operation, check the disk properties and status of the disk domain that houses the disk to avoid reconstruction failure, service interruption and data loss. Back up data before powering off.
Have you read danger alert message carefully?(y/n)y
Are you sure you really want to perform the operation?(y/n)y
Command executed successfully.
engineer:/>poweron disk disk_id=DAE000.0
Command executed successfully.
If a faulty disk is not a member disk, its disk object is released after power-off, so the power-on command will fail.
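When several disks are faulty, it can help to prepare the command sequence as a checklist before starting. The sketch below only builds the command strings, using the poweroff disk and poweron disk syntax shown in the session above. The helper name build_power_cycle_commands is hypothetical, and the sketch does not connect to the storage system or answer the confirmation prompts.

```python
def build_power_cycle_commands(disk_ids):
    """Return the CLI commands for power cycling each faulty disk in order.

    Each disk is powered off and then powered on before moving to the next.
    The danger confirmations (y/n) must still be answered interactively.
    """
    commands = []
    for disk_id in disk_ids:
        commands.append(f"poweroff disk disk_id={disk_id}")
        commands.append(f"poweron disk disk_id={disk_id}")
    return commands


if __name__ == "__main__":
    # Disk IDs taken from the example output in this article.
    for cmd in build_power_cycle_commands(["DAE000.0", "DAE000.1", "DAE000.2"]):
        print(cmd)
```

Keeping each power-off immediately followed by its power-on mirrors the example session. For a non-member disk, the poweron entry in the checklist is expected to fail, as noted above.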
- After all faulty disks are powered on successfully, check the health status.
Log in to the CLI and run the show disk_domain general command to check the status.
admin:/>show disk_domain general
ID  Name  Health Status  Running Status  Total Capacity  Free Capacity  Hot Spare Capacity  Used Hot Spare Capacity
--  ----  -------------  --------------  --------------  -------------  ------------------  -----------------------
0   d0    Normal         Online          4.055TB         556.242GB      524.312GB           0.000B
- If "Health Status" is "Normal" or "Degraded", services are recovering.
- If "Health Status" is any other value, services have not recovered. Contact Huawei R&D engineers for assistance.
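The decision rule in the last step can be written down as a small Python sketch. It assumes the disk domain rows have already been read into dictionaries with "ID" and "Health Status" keys. That data shape and the name domains_needing_escalation are illustrative assumptions, not part of the product.

```python
# Health Status values that this article treats as "services are recovering".
RECOVERING_STATES = {"Normal", "Degraded"}


def domains_needing_escalation(domains):
    """Return IDs of disk domains whose Health Status is neither Normal nor Degraded.

    `domains` is a list of dicts with "ID" and "Health Status" keys
    (a hypothetical shape chosen for this sketch).
    """
    return [
        d["ID"]
        for d in domains
        if d["Health Status"].strip() not in RECOVERING_STATES
    ]


if __name__ == "__main__":
    # Row taken from the example output in this article.
    sample = [{"ID": "0", "Health Status": "Normal"}]
    print(domains_needing_escalation(sample))
```

An empty result means every listed disk domain is Normal or Degraded. Any ID in the result corresponds to the case where Huawei R&D engineers should be contacted.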