FusionCloud 6.3.1 Troubleshooting 06

etcd Backup Restoration Fails Because the Node Hosting the etcd-backup Service Is Powered Off or Restarted (Tenant Management Zone)


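Symptom

In the tenant management zone, an etcd backup succeeds. If a restoration is performed while one node of the etcd cluster is being powered off, the restoration may fail and the etcd pods may become abnormal. After the node is powered on again and etcd is repaired manually, the restoration still fails when it is performed again.

Possible Causes

The node that was powered off happens to be the node hosting the etcd-backup service. During the restoration, etcd-backup was scheduled to another node, which caused the restoration to fail.

Procedure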

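  1. Use PuTTY to log in to the om_core1_ip node.

    Default account: paas; default password: QAZ2wsx@123!

  2. Run the following command to query the statefulsets for etcd and check whether the expected number of etcd pods has started.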

    kubectl get statefulset -nmanage | grep etcd

    etcd                    3             3                        2d
    etcd-event              3             3                        2d
    etcd-network            3             0                        2d

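    In the command output, the second column is the desired number of pods and the third column is the number of pods actually running. Check whether the two values are the same.

    • If yes, the service is normal.
    • If no, go to step 3.

  3. This procedure uses a failed etcd-network pod as an example. Edit the etcd-network statefulset and change the value of enable from false to true.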

    kubectl edit statefulset -nmanage etcd-network

  4. Run the following command to check whether the etcd-network pods are in the Running state.

    kubectl get pod -nmanage -owide | grep etcd

    Once all pods are in the Running state, continue with the following steps.

  5. Run the following command to find the node where the etcd-network pod is running.

    kubectl get pod -nmanage -owide | grep etcd

  6. Log in to the node obtained in step 5 and run the following command to switch to the root user.

    su - root

  7. Run the following command to query the container ID of the etcd pod.

    docker ps | grep etcd

  8. Enter the etcd pod container and get the value of /registry/configmaps/manage/etcd.download.info.

    docker exec -ti containerID bash

    containerID is the container ID obtained in step 7.

    ETCDCTL_API=3 /start-etcd --cacert /srv/kubernetes/ca.crt --cert /srv/kubernetes/server.cer --key /srv/kubernetes/server_key.pem --endpoints https://127.0.0.1:4001 get /registry/configmaps/manage/etcd.download.info

    Information similar to the following is displayed:

    {"kind":"ConfigMap","apiVersion":"v1","metadata":{"name":"etcd.download.info","namespace":"manage","selfLink":"/api/v1/namespaces/manage/configmaps/etcd.download.info","uid":"15b788c6-d9a1-11e8-8f76-286ed488c964","creationTimestamp":"2018-10-27T04:30:51Z","labels":{"app":"etcd-backup-1569e6e2-d9a","appgroup":"etcd-manage_default-appGroup","stack-name":"etcd-manage"},"enable":true},"data":{"etcd":"{\"etcdtype\":\"etcd\",\"downloadurl\":\"https://etcd-backup.manage.svc.cluster.local:30436/pbs/v2/manage/etcd/tar/etcd_2018-10-29-11-20-20_manage_e025db26f9bd8c747a6db15f8f796386.tar.gz\",\"bakdatainfo\":\"/ftpboot/etcd_backup/etcd_2018-10-29-11-20-20_manage_e025db26f9bd8c747a6db15f8f796386.tar.gz\",\"backupway\":\"sftp\",\"encryptionflag\":true,\"remoteaddress\":{\"Id\":1,\"ip\":\"9.91.0.80\",\"name\":\"9.91.0.45\",\"password\":\"QAZ2wsx@123!\",\"path\":\"/ftpboot\",\"port\":22,\"type\":\"sftp\",\"user\":\"sftpuser\"}}","etcd-network":"{\"etcdtype\":\"etcd-network\",\"downloadurl\":\"https://etcd-backup.manage.svc.cluster.local:30436/pbs/v2/manage/etcd-network/tar/etcd-network_2018-10-29-11-20-21_manage_825eaa88f2a53550895dfb962c653436.tar.gz\",\"bakdatainfo\":\"/ftpboot/etcd_backup/etcd-network_2018-10-29-11-20-21_manage_825eaa88f2a53550895dfb962c653436.tar.gz\",\"backupway\":\"sftp\",\"encryptionflag\":true,\"remoteaddress\":{\"Id\":1,\"ip\":\"9.91.0.80\",\"name\":\"9.91.0.45\",\"password\":\"QAZ2wsx@123!\",\"path\":\"/ftpboot\",\"port\":22,\"type\":\"sftp\",\"user\":\"sftpuser\"}}"}}

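  9. Check which of etcd, etcd-event, and etcd-network failed to be restored, and delete the corresponding entry from data in the key's value.

    For example, if the etcd-network pod failed to be restored, delete the entire etcd-network entry from data in the value obtained in step 8, which is shown again below: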

    {"kind":"ConfigMap","apiVersion":"v1","metadata":{"name":"etcd.download.info","namespace":"manage","selfLink":"/api/v1/namespaces/manage/configmaps/etcd.download.info","uid":"15b788c6-d9a1-11e8-8f76-286ed488c964","creationTimestamp":"2018-10-27T04:30:51Z","labels":{"app":"etcd-backup-1569e6e2-d9a","appgroup":"etcd-manage_default-appGroup","stack-name":"etcd-manage"},"enable":true},"data":{"etcd":"{\"etcdtype\":\"etcd\",\"downloadurl\":\"https://etcd-backup.manage.svc.cluster.local:30436/pbs/v2/manage/etcd/tar/etcd_2018-10-29-11-20-20_manage_e025db26f9bd8c747a6db15f8f796386.tar.gz\",\"bakdatainfo\":\"/ftpboot/etcd_backup/etcd_2018-10-29-11-20-20_manage_e025db26f9bd8c747a6db15f8f796386.tar.gz\",\"backupway\":\"sftp\",\"encryptionflag\":true,\"remoteaddress\":{\"Id\":1,\"ip\":\"9.91.0.80\",\"name\":\"9.91.0.45\",\"password\":\"QAZ2wsx@123!\",\"path\":\"/ftpboot\",\"port\":22,\"type\":\"sftp\",\"user\":\"sftpuser\"}}","etcd-network":"{\"etcdtype\":\"etcd-network\",\"downloadurl\":\"https://etcd-backup.manage.svc.cluster.local:30436/pbs/v2/manage/etcd-network/tar/etcd-network_2018-10-29-11-20-21_manage_825eaa88f2a53550895dfb962c653436.tar.gz\",\"bakdatainfo\":\"/ftpboot/etcd_backup/etcd-network_2018-10-29-11-20-21_manage_825eaa88f2a53550895dfb962c653436.tar.gz\",\"backupway\":\"sftp\",\"encryptionflag\":true,\"remoteaddress\":{\"Id\":1,\"ip\":\"9.91.0.80\",\"name\":\"9.91.0.45\",\"password\":\"QAZ2wsx@123!\",\"path\":\"/ftpboot\",\"port\":22,\"type\":\"sftp\",\"user\":\"sftpuser\"}}"}}

  10. Write the modified data back.

    ETCDCTL_API=3 /start-etcd --cacert /srv/kubernetes/ca.crt --cert /srv/kubernetes/server.cer --key /srv/kubernetes/server_key.pem --endpoints https://127.0.0.1:4001 put /registry/configmaps/manage/etcd.download.info '{"kind":"ConfigMap","apiVersion":"v1","metadata":{"name":"etcd.download.info","namespace":"manage","selfLink":"/api/v1/namespaces/manage/configmaps/etcd.download.info","uid":"15b788c6-d9a1-11e8-8f76-286ed488c964","creationTimestamp":"2018-10-27T04:30:51Z","labels":{"app":"etcd-backup-1569e6e2-d9a","appgroup":"etcd-manage_default-appGroup","stack-name":"etcd-manage"},"enable":true},"data":{"etcd":"{\"etcdtype\":\"etcd\",\"downloadurl\":\"https://etcd-backup.manage.svc.cluster.local:30436/pbs/v2/manage/etcd/tar/etcd_2018-10-29-11-20-20_manage_e025db26f9bd8c747a6db15f8f796386.tar.gz\",\"bakdatainfo\":\"/ftpboot/etcd_backup/etcd_2018-10-29-11-20-20_manage_e025db26f9bd8c747a6db15f8f796386.tar.gz\",\"backupway\":\"sftp\",\"encryptionflag\":true,\"remoteaddress\":{\"Id\":1,\"ip\":\"9.91.0.80\",\"name\":\"9.91.0.45\",\"password\":\"QAZ2wsx@123!\",\"path\":\"/ftpboot\",\"port\":22,\"type\":\"sftp\",\"user\":\"sftpuser\"}}"}}'
    NOTE:

    The value passed to put is the key's value after the entry of the cluster that failed to be restored has been deleted from data:

    {"kind":"ConfigMap","apiVersion":"v1","metadata":{"name":"etcd.download.info","namespace":"manage","selfLink":"/api/v1/namespaces/manage/configmaps/etcd.download.info","uid":"15b788c6-d9a1-11e8-8f76-286ed488c964","creationTimestamp":"2018-10-27T04:30:51Z","labels":{"app":"etcd-backup-1569e6e2-d9a","appgroup":"etcd-manage_default-appGroup","stack-name":"etcd-manage"},"enable":true},"data":{"etcd":"{\"etcdtype\":\"etcd\",\"downloadurl\":\"https://etcd-backup.manage.svc.cluster.local:30436/pbs/v2/manage/etcd/tar/etcd_2018-10-29-11-20-20_manage_e025db26f9bd8c747a6db15f8f796386.tar.gz\",\"bakdatainfo\":\"/ftpboot/etcd_backup/etcd_2018-10-29-11-20-20_manage_e025db26f9bd8c747a6db15f8f796386.tar.gz\",\"backupway\":\"sftp\",\"encryptionflag\":true,\"remoteaddress\":{\"Id\":1,\"ip\":\"9.91.0.80\",\"name\":\"9.91.0.45\",\"password\":\"QAZ2wsx@123!\",\"path\":\"/ftpboot\",\"port\":22,\"type\":\"sftp\",\"user\":\"sftpuser\"}}"}}

    The following information is displayed:

    OK

  11. Run the following command to exit the container.

    exit

  12. Perform the restoration again on the O&M management plane. If the restoration succeeds, the fault is rectified.
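
Steps 2, 9, and 10 rely on comparing columns by eye and hand-editing a long single-line JSON value, which is easy to get wrong. The Python sketch below is one possible way to do both checks offline; it is not part of the FusionCloud tooling. All function names are hypothetical. It assumes that the step 2 output carries the name, desired count, and current count in its first three columns, and that the text passed to drop_failed_cluster is exactly the JSON value returned in step 8.

```python
import json
import shlex

# Key taken from steps 8 and 10 of the procedure.
KEY = "/registry/configmaps/manage/etcd.download.info"


def find_unhealthy(statefulset_output):
    """Return statefulset names whose desired and current pod counts differ.

    Assumes each data line is shaped like the step 2 output:
    NAME DESIRED CURRENT AGE.
    """
    unhealthy = []
    for line in statefulset_output.splitlines():
        fields = line.split()
        # Skip blank lines and any header row (non-numeric count columns).
        if len(fields) < 3 or not fields[1].isdigit() or not fields[2].isdigit():
            continue
        name, desired, current = fields[0], int(fields[1]), int(fields[2])
        if desired != current:
            unhealthy.append(name)
    return unhealthy


def drop_failed_cluster(value, failed):
    """Remove the failed cluster's entry from "data" and return compact JSON."""
    configmap = json.loads(value)
    if failed not in configmap["data"]:
        raise KeyError(f"{failed!r} not found in data")
    del configmap["data"][failed]
    # Compact separators keep the value on one line with no added spaces.
    return json.dumps(configmap, separators=(",", ":"))


def put_arguments(new_value):
    """Shell-quote the key and value to append to the put command in step 10."""
    return f"put {shlex.quote(KEY)} {shlex.quote(new_value)}"


if __name__ == "__main__":
    # Shortened, made-up sample data; paste the real output in practice.
    sample_output = "etcd 3 3 2d\netcd-event 3 3 2d\netcd-network 3 0 2d"
    print(find_unhealthy(sample_output))

    sample_value = '{"kind":"ConfigMap","data":{"etcd":"{}","etcd-network":"{}"}}'
    print(put_arguments(drop_failed_cluster(sample_value, "etcd-network")))
```

Parsing and re-serializing the value, rather than deleting text by hand, keeps the escaped quotes inside each data entry intact. Compare the generated value with the original before writing it back in step 10.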
Updated: 2019-08-19

Document ID: EDOC1100043088
