File Server Cluster Resources Fail to Come Online

Published: 2015-03-19
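Problem Description
After VCS and the file server software were installed on site, the two-node cluster failed to start: apart from the floating IP, no resource could be brought online. The site reported that every step had been performed according to the documentation.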
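Troubleshooting Process

Ask the customer to run the ./Linux_VCS_info_collect.sh script on either of the two cluster nodes (the script must be uploaded first; see the product documentation for details) and send back the collected logs.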

1. VCS_engine_A.txt in the log package contains the output below. It shows that the resources failed to come online because mounting the disk failed.

2015/03/16 11:55:42 VCS INFO V-16-2-13716 (FS1) Resource(mount): Output of the completed operation (online)
==============================================
mount: wrong fs type, bad option, bad superblock on /dev/mapper/vg_fs-lv_fs,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

mount: wrong fs type, bad option, bad superblock on /dev/mapper/vg_fs-lv_fs,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so
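
As a minimal sketch of this first check, the snippet below scans engine log text for `mount:` failure lines and lists the devices they name. The function name `find_mount_failures` and the regular expression are illustrative assumptions, not part of VCS or of the collection script; the sample text is copied from the log above.

```python
import re

# First line of the mount failure message seen in the engine log; captures the device.
MOUNT_ERROR = re.compile(
    r"^mount: wrong fs type, bad option, bad superblock on (\S+?),?\s*$"
)

def find_mount_failures(log_text):
    """Return the distinct devices named in 'mount: wrong fs type' lines, in order."""
    devices = []
    for line in log_text.splitlines():
        match = MOUNT_ERROR.match(line.strip())
        if match and match.group(1) not in devices:
            devices.append(match.group(1))
    return devices

sample = """2015/03/16 11:55:42 VCS INFO V-16-2-13716 (FS1) Resource(mount): Output of the completed operation (online)
mount: wrong fs type, bad option, bad superblock on /dev/mapper/vg_fs-lv_fs,
       missing codepage or helper program, or other error
"""
print(find_mount_failures(sample))
```

In practice the whole VCS_engine_A.txt file would be read and passed in; a non-empty result points the investigation at the listed devices.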

2. The operating system log messages.txt shows errors logged when the disk was mounted. Next, check the disk information log fdisk.txt.

Mar 16 15:11:02 FS1 kernel: [   54.987133] [227][2015-03-16 15:11:02:817542][00005342088002de][ERROR][PIA][PIA_TransReqComplete][734]upcmd ID 0x109 from VHBA fail, result = 0x8000002!,SK 0x5 ASC/ASCQ 0x20/0
Mar 16 15:11:02 FS1 kernel: [   54.989299] [228][2015-03-16 15:11:02:819710][00005342088002de][ERROR][PIA][PIA_TransReqComplete][734]upcmd ID 0x113 from VHBA fail, result = 0x8000002!,SK 0x5 ASC/ASCQ 0x20/0
Mar 16 15:11:02 FS1 kernel: [   54.989322] [229][2015-03-16 15:11:02:819731][000053420662043c][ERROR][VDS][VDS_HSParseIllegalReq][1084]Cmd sensekey is "illegal request", device is command device, action is "pass to upper layer".sense length {96},sense key {0x05},asc/ascq {0x20/00},op {0x4d},cmd {113},(disk (3),tpg {1},path {1})
Mar 16 15:11:02 FS1 kernel: [   54.989331] [230][2015-03-16 15:11:02:819743][00005342053104c6][ERROR][LPM][LPM_CmdActCmdError][1222]cmd fail, because XMP_ACT_CMD_ERROR, pass to upper layer.
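
To read the sense data in lines like these, a small decoder can help. This is only a sketch: the `SK ... ASC/ASCQ ...` layout is taken from the messages above, `decode_sense` is a hypothetical helper, and the lookup tables cover only the codes seen in this case (standard SCSI meanings: sense key 0x5 is ILLEGAL REQUEST, ASC/ASCQ 0x20/0x00 is INVALID COMMAND OPERATION CODE).

```python
import re

# Only the codes that appear in this case; extend as needed.
SENSE_KEYS = {0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {(0x20, 0x00): "INVALID COMMAND OPERATION CODE"}

# Matches e.g. "SK 0x5 ASC/ASCQ 0x20/0" as printed in messages.txt.
SENSE_PATTERN = re.compile(
    r"SK (0x[0-9a-fA-F]+) ASC/ASCQ (0x[0-9a-fA-F]+)/([0-9a-fA-F]+)"
)

def decode_sense(line):
    """Return (sense key text, ASC/ASCQ text) for a log line, or None if absent."""
    match = SENSE_PATTERN.search(line)
    if match is None:
        return None
    key = int(match.group(1), 16)
    asc = int(match.group(2), 16)
    ascq = int(match.group(3), 16)
    return (
        SENSE_KEYS.get(key, "unknown"),
        ASC_ASCQ.get((asc, ascq), "unknown"),
    )

line = "upcmd ID 0x109 from VHBA fail, result = 0x8000002!,SK 0x5 ASC/ASCQ 0x20/0"
print(decode_sense(line))
```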

 

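3. fdisk.txt shows one LV with a size of about 18 TB, which exceeds the maximum capacity supported by an ext3 file system (16 TiB with 4 KiB blocks), so the mount fails.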

#==[ Command ]======================================#
# lvdisplay
  --- Logical volume ---
  LV Name                /dev/vg_fs/lv_fs
  VG Name                vg_fs
  LV UUID                sI1JC1-JjNv-EId6-FCKu-HvXl-b2b0-sHlKWw
  LV Write Access        read/write
  LV Status              NOT available
  LV Size                18.07 TB
  Current LE             148068
  Segments               4
  Allocation             inherit
  Read ahead sectors     auto 
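
The size comparison can be sketched as follows. The assumptions are that the file system uses 4 KiB blocks (which gives ext3 a limit of 2^32 blocks, or 16 TiB) and that this older lvdisplay output uses binary units labeled MB/GB/TB; `lv_size_bytes` and `exceeds_ext3_limit` are hypothetical helper names.

```python
import re

# Older LVM2 output labels binary units as MB/GB/TB (assumption for this log).
UNITS = {"MB": 2**20, "GB": 2**30, "TB": 2**40}

EXT3_BLOCK_SIZE = 4096                    # assumed 4 KiB block size
EXT3_MAX_BYTES = 2**32 * EXT3_BLOCK_SIZE  # 32-bit block numbers: 16 TiB

def lv_size_bytes(lvdisplay_text):
    """Parse the first 'LV Size' line of lvdisplay output into bytes."""
    match = re.search(r"LV Size\s+([0-9.]+)\s+(MB|GB|TB)", lvdisplay_text)
    if match is None:
        raise ValueError("no 'LV Size' line found")
    return int(float(match.group(1)) * UNITS[match.group(2)])

def exceeds_ext3_limit(lvdisplay_text):
    """True if the LV is larger than the assumed ext3 maximum."""
    return lv_size_bytes(lvdisplay_text) > EXT3_MAX_BYTES

sample_lv = "  LV Size                18.07 TB"
print(exceeds_ext3_limit(sample_lv))
```

With a smaller block size the ext3 limit is lower still, so the conclusion for an 18.07 TB volume does not change.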

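Root Cause
The single LV being mounted is about 18 TB, which exceeds the maximum capacity that the operating system's ext3 file system can manage, so the mount fails.
Recommendations and Summary

The first place to look when troubleshooting a VCS problem is VCS_engine_A.txt, which usually shows roughly why a resource does not come online. If the cause is related to the operating system, check messages.txt. Other cases must be analyzed individually in light of the actual service.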
