Alarm "Disk device not found" generated in S5500T file engine

Publication Date: 2014-07-31
Issue Description
The customer logged in to the file engine through ISM and found the alarm "Disk device not found on the storage unit". Services were not affected.

Alarm Information
Disk device not found on the storage unit.


Handling Process
1, Based on the alarm message, first check the status of the disks in the file engine.

Log in to the file engine with the support account and run "vxdisk list -o alldgs". The output was as follows:

RCShwengine_02:~ # vxdisk list -o alldgs
DEVICE       TYPE            DISK         GROUP        STATUS
huawei-s5500t0_30 auto:simple     huawei-s5500t0_30  sfsdg        online shared
huawei-s5500t0_31 auto:simple     huawei-s5500t0_33  sfscoorddg   online
huawei-s5500t0_32 auto:simple     huawei-s5500t0_32  sfscoorddg   online
huawei-s5500t0_33 auto:simple     huawei-s5500t0_31  sfsdg        online shared
huawei-s5500t0_34 auto:simple     huawei-s5500t0_34  sfscoorddg   online
huawei-s5500t0_45 auto:simple     huawei-s5500t0_46  sfsdg        online shared
huawei-s5500t0_46 auto:simple     huawei-s5500t0_48  sfsdg        online shared
huawei-s5500t0_47 auto:simple     huawei-s5500t0_47  sfsdg        online shared
huawei-s5500t0_48 auto:simple     huawei-s5500t0_45  sfsdg        online shared
huawei-s5500t0_51 auto:simple     huawei-s5500t0_51  sfsdg        online shared
huawei-s5500t0_52 auto:simple     huawei-s5500t0_52  sfsdg        online shared
-            -         huawei-s5500t0_50 sfsdg        failed was:huawei-s5500t0_50
RCShwengine_02:~ #

2, The output shows one disk in the "failed" state (huawei-s5500t0_50 in disk group sfsdg), so something is wrong with this disk.
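On a system with many disks, the failed record can be picked out of saved "vxdisk list -o alldgs" output with a short script. The following is a minimal sketch, not part of the original procedure. The function name find_failed_disks is hypothetical, and the parsing assumes the five-column layout (DEVICE, TYPE, DISK, GROUP, STATUS) shown in the output above.

```python
def find_failed_disks(vxdisk_output):
    """Return (disk, group, status) for every record whose STATUS starts with 'failed'."""
    failed = []
    for line in vxdisk_output.splitlines():
        # Split into at most five fields; STATUS may contain spaces ("online shared").
        fields = line.split(None, 4)
        if len(fields) < 5 or fields[0] == "DEVICE":
            continue  # skip the header, shell prompts and blank lines
        _device, _dtype, disk, group, status = fields
        if status.split()[0] == "failed":
            failed.append((disk, group, status))
    return failed


# Sample taken from the output in this article (shortened).
SAMPLE_VXDISK = """\
DEVICE       TYPE            DISK         GROUP        STATUS
huawei-s5500t0_30 auto:simple     huawei-s5500t0_30  sfsdg        online shared
huawei-s5500t0_52 auto:simple     huawei-s5500t0_52  sfsdg        online shared
-            -         huawei-s5500t0_50 sfsdg        failed was:huawei-s5500t0_50
"""

for disk, group, status in find_failed_disks(SAMPLE_VXDISK):
    print(disk, group, status)
```

For the sample above, the only record reported is huawei-s5500t0_50 in disk group sfsdg.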

3, Next, run "vxprint" to check whether any file systems reside on this disk:

RCShwengine_02:~ # vxprint
Disk group: sfsdg

TY NAME         ASSOC        KSTATE   LENGTH   PLOFFS   STATE    TUTIL0  PUTIL0
dg sfsdg        sfsdg        -        -        -        -        -       -

dm huawei-s5500t0_30 huawei-s5500t0_30 - 167604544 -    -        -       -
dm huawei-s5500t0_31 huawei-s5500t0_33 - 167604544 -    -        -       -
dm huawei-s5500t0_45 huawei-s5500t0_48 - 12884836000 -  -        -       -
dm huawei-s5500t0_46 huawei-s5500t0_45 - 32212188832 -  -        -       -
dm huawei-s5500t0_47 huawei-s5500t0_47 - 32212188832 -  -        -       -
dm huawei-s5500t0_48 huawei-s5500t0_46 - 12884836000 -  -        -       -
dm huawei-s5500t0_50 -       -        -        -        NODEVICE -       -
dm huawei-s5500t0_51 huawei-s5500t0_51 - 209534192 -    -        -       -
dm huawei-s5500t0_52 huawei-s5500t0_52 - 209534192 -    -        -       -

v  _nlm_        fsgen        ENABLED  204800   -        ACTIVE   -       -
pl _nlm_-01     _nlm_        ENABLED  204800   -        ACTIVE   -       -
sd huawei-s5500t0_30-02 _nlm_-01 ENABLED 204800 0       -        -       -
pl _nlm_-02     _nlm_        ENABLED  204800   -        ACTIVE   -       -
sd huawei-s5500t0_31-02 _nlm_-02 ENABLED 204800 0       -        -       -

vt DDBORA       -            ENABLED  -        -        ACTIVE   -       -
v  DDBORA_tier1 DDBORA       DISABLED 62668800 -        ACTIVE   -       -
pl DDBORA_tier1-01 DDBORA_tier1 DISABLED 62668800 -     NODEVICE -       -
sd huawei-s5500t0_50-04 DDBORA_tier1-01 DISABLED 62668800 0 NODEVICE -   -

vt DDBSAPESDB   -            ENABLED  -        -        ACTIVE   -       -
v  DDBSAPESDB_tier1 DDBSAPESDB DISABLED 251658240 -     ACTIVE   -       -
pl DDBSAPESDB_tier1-01 DDBSAPESDB_tier1 DISABLED 251658240 - NODEVICE -  -
sd huawei-s5500t0_50-01 DDBSAPESDB_tier1-01 DISABLED 251658240 0 NODEVICE - -

vt DDBSAPESFS   -            ENABLED  -        -        ACTIVE   -       -
v  DDBSAPESFS_tier1 DDBSAPESFS DISABLED 901775360 -     ACTIVE   -       -
pl DDBSAPESFS_tier1-01 DDBSAPESFS_tier1 DISABLED 901775360 - NODEVICE -  -
sd huawei-s5500t0_50-02 DDBSAPESFS_tier1-01 DISABLED 901775360 0 NODEVICE - -

vt DDBSAPITDB   -            ENABLED  -        -        ACTIVE   -       -
v  DDBSAPITDB_tier1 DDBSAPITDB DISABLED 335544320 -     ACTIVE   -       -
pl DDBSAPITDB_tier1-01 DDBSAPITDB_tier1 DISABLED 335544320 - NODEVICE -  -
sd huawei-s5500t0_50-03 DDBSAPITDB_tier1-01 DISABLED 335544320 0 NODEVICE - -


4, The output shows four file systems on this disk (volume sets DDBORA, DDBSAPESDB, DDBSAPESFS and DDBSAPITDB). Their volumes are DISABLED, and the plexes and subdisks are in the NODEVICE state.
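The mapping from NODEVICE subdisks up to the affected volumes can also be traced programmatically. The following is a minimal sketch under stated assumptions: the function name volumes_without_device is hypothetical, and the parsing relies on the nine-column "vxprint" layout shown above (TY, NAME, ASSOC, KSTATE, LENGTH, PLOFFS, STATE, TUTIL0, PUTIL0).

```python
def volumes_without_device(vxprint_output):
    """Return (subdisk, volume, volume_assoc) for every subdisk in the NODEVICE state."""
    plex_to_volume = {}   # plex name -> volume name
    volume_assoc = {}     # volume name -> ASSOC column (a volume set, or a usage type such as fsgen)
    broken = []           # (subdisk name, plex name)
    for line in vxprint_output.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank lines, prompts and the "Disk group:" line
        rec_type, name, assoc, state = fields[0], fields[1], fields[2], fields[6]
        if rec_type == "v":
            volume_assoc[name] = assoc
        elif rec_type == "pl":
            plex_to_volume[name] = assoc
        elif rec_type == "sd" and state == "NODEVICE":
            broken.append((name, assoc))
    result = []
    for subdisk, plex in broken:
        volume = plex_to_volume.get(plex)
        result.append((subdisk, volume, volume_assoc.get(volume)))
    return result


# Sample taken from the output in this article (shortened).
SAMPLE_VXPRINT = """\
TY NAME         ASSOC        KSTATE   LENGTH   PLOFFS   STATE    TUTIL0  PUTIL0
v  _nlm_        fsgen        ENABLED  204800   -        ACTIVE   -       -
pl _nlm_-01     _nlm_        ENABLED  204800   -        ACTIVE   -       -
sd huawei-s5500t0_30-02 _nlm_-01 ENABLED 204800 0       -        -       -
vt DDBORA       -            ENABLED  -        -        ACTIVE   -       -
v  DDBORA_tier1 DDBORA       DISABLED 62668800 -        ACTIVE   -       -
pl DDBORA_tier1-01 DDBORA_tier1 DISABLED 62668800 -     NODEVICE -       -
sd huawei-s5500t0_50-04 DDBORA_tier1-01 DISABLED 62668800 0 NODEVICE -   -
"""

for subdisk, volume, assoc in volumes_without_device(SAMPLE_VXPRINT):
    print(subdisk, volume, assoc)
```

The sketch only reads saved command output. It changes nothing on the file engine, so it can be run on a workstation against a captured transcript.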

5, Logging in to the file engine through ISM confirms that these four file systems are offline.


6, The customer confirmed that these four file systems were used only for testing and are no longer needed.

7, Delete the four file systems in ISM, then run the following command under the support account to remove the stale record of the failed disk:

vxdg rmdisk huawei-s5500t0_50

8, The alarm is cleared and the problem is resolved.

Root Cause
Based on the alarm message, the file engine detected a problem with one disk (in the file engine, a disk is a LUN on the storage array).
There are two possible reasons:

1, There is a problem with the LUN itself.
2, There is a problem with the link between the storage array and the file engine.
Suggestions
1, The root cause in this case was that, when the customer no longer needed the disk, they did not unmap it from the file engine first. Instead, they logged in to the back-end storage array through ISM, unmapped the disk, and then deleted the LUN directly on the storage array. The storage array log records these operations:
2014-07-04 11:47:24 DST    0x200f01000020    Infor    None    admin:10.218.28.12 succeeded in removing the LUN (9) from the LUN group (ID 0).
2014-07-04 11:47:49 DST    0x200f000b0037    Infor    None    admin:10.218.28.12 succeeded in deleting the LUN (ID 9, name DDB_fileengine).

2, Confirm that all the affected file systems are no longer needed before deleting them. Then run "vxdg rmdisk huawei-s5500t0_50" to remove the stale record from the file engine. (huawei-s5500t0_50 is the name of the disk whose status is "failed" in the output of "vxdisk list -o alldgs".)

3, This problem can occur in any version of the N8300, the N8500 and the file engine. However, if the file engine and the storage array are discovered at the same time, the problem does not occur.
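The LUN deletion record cited in suggestion 1 can be located in an exported storage array log with a short filter. The following is a minimal sketch. The function name find_lun_deletions is hypothetical, and the pattern is inferred only from the two log lines quoted in this article, so other firmware versions or event texts may need a different expression.

```python
import re

# Pattern inferred from the "succeeded in deleting the LUN" line quoted in this article.
LUN_DELETE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}).*?"
    r"(?P<user>\S+):(?P<ip>\d+(?:\.\d+){3}) succeeded in deleting the LUN "
    r"\(ID (?P<lun_id>\d+), name (?P<name>[^)]+)\)"
)


def find_lun_deletions(log_text):
    """Return one dict (time, user, ip, lun_id, name) per LUN deletion event found."""
    events = []
    for line in log_text.splitlines():
        match = LUN_DELETE.search(line)
        if match:
            events.append(match.groupdict())
    return events


# The two log lines quoted in this article.
SAMPLE_LOG = """\
2014-07-04 11:47:24 DST    0x200f01000020    Infor    None    admin:10.218.28.12 succeeded in removing the LUN (9) from the LUN group (ID 0).
2014-07-04 11:47:49 DST    0x200f000b0037    Infor    None    admin:10.218.28.12 succeeded in deleting the LUN (ID 9, name DDB_fileengine).
"""

for event in find_lun_deletions(SAMPLE_LOG):
    print(event["time"], event["user"], event["ip"], event["lun_id"], event["name"])
```

Comparing the time of the deletion event with the time the alarm was first raised helps confirm that the alarm was caused by the LUN deletion rather than by a link fault.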
