Intermittent Disconnection of an S2600 Disk

Publication Date: 2012-10-18
Issue Description
An S2600 at a site reported a disk fault, and the cause needed to be identified.
Alarm Information
After collecting the logs with the log collection tool, the alarm information shows the following events:
Alarm Name:log; Alarm Level:Event; Cleared Time:-- --; Recovered Time:-- --; Cleared User:-- --; Cause:none; Suggestion:none; Description:Disk( 0,10) in on Controller B
Alarm Name:log; Alarm Level:Event; Cleared Time:-- --; Recovered Time:-- --; Cleared User:-- --; Cause:none; Suggestion:none; Description:Disk( 0,10) out on Controller B
Handling Process
AK-I kernel: [18842412991][ParseSasEvent:8416]:SasDevice: 5000c50039bb25a9 BusId: 0 ScsiId: 10 RESET!
Analysis of the array logs shows that hard disk (0, 10) keeps sending RESET signals while running, which causes intermittent disconnection between the hard disk and the back-end disk chip. Replace this hard disk.
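To judge whether one device "keeps sending" RESET, the events can be counted per device. The following is a minimal Python sketch, not a vendor tool. The regular expression is modeled only on the single ParseSasEvent line quoted above, so it is an assumption that other firmware versions print the same format. The names RESET_RE and count_resets are hypothetical.

```python
import re
from collections import Counter

# Pattern modeled on the one ParseSasEvent line quoted in this case.
# Assumption: other firmware versions use the same message layout.
RESET_RE = re.compile(
    r"SasDevice:\s*(?P<sas>[0-9a-fA-F]+)\s+"
    r"BusId:\s*(?P<bus>\d+)\s+"
    r"ScsiId:\s*(?P<scsi>\d+)\s+RESET"
)


def count_resets(lines):
    """Count RESET events per (SAS address, BusId, ScsiId)."""
    counts = Counter()
    for line in lines:
        m = RESET_RE.search(line)
        if m:
            key = (m.group("sas"), int(m.group("bus")), int(m.group("scsi")))
            counts[key] += 1
    return counts


sample = [
    "AK-I kernel: [18842412991][ParseSasEvent:8416]:SasDevice: "
    "5000c50039bb25a9 BusId: 0 ScsiId: 10 RESET!",
]

for (sas, bus, scsi), n in count_resets(sample).most_common():
    print(f"SasDevice {sas} BusId={bus} ScsiId={scsi}: {n} RESET event(s)")
```

In practice the lines would come from the collected message log rather than from the one-line sample used here. A device whose count keeps growing over time is the candidate for replacement.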
Root Cause
1. Examine the physical state of this disk:
showdisk -s 0 -sl 10
    =========================================
                Disk Information
    -----------------------------------------
     Disk Location            |  (0, 10)
     Type                     |  --
     Associated Disk          |  --
     Reconstruction Progress  |  --
     RAID Group ID            |  --
     Logical Status           |  --
     Usable Capacity(GB)      |  --
     Physical Status          |  Isolation
     Master Path              |  Offline
     Slave Path               |  Offline
     Temperature(℃)           |  27
     Speed(RPM)               |  --
     World Wide Name          |  --
     Vendor                   |  SEAGATE
     Model                    |  ST3300657SS
     Firmware Number          |  0008
     Serial Number            |  --
     Physical Type            |  SAS
     Current Speed(Gbps)      |  --
     Raw Capacity(GB)         |  --
    =========================================
2. Analyze the message log. It shows the following:
1) When the hard disk itself disconnects intermittently (internal device reset), the following log is printed:
AK-I kernel: [18842412991]mptbase: ioc0: WARNING - MPT event:(0Fh) : SAS Device Status Change: Internal Device Reset: ha0 channel=0 id=10
Here, 10 is the SCSI ID of this hard disk. When locating the fault, check whether the SCSI ID in these messages is always the same. If it is, the hard disk has bit errors and should be replaced. If it is not, the bit errors may be in the cascading port or the cascading cable.
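The SCSI ID check described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a vendor tool. The pattern is derived only from the "Internal Device Reset" line quoted in this case, and the names IDR_RE, reset_ids, and diagnose are hypothetical.

```python
import re
from collections import Counter

# Pattern modeled on the mptbase "Internal Device Reset" line quoted above.
IDR_RE = re.compile(
    r"Internal Device Reset:\s*ha\d+\s+channel=\d+\s+id=(?P<id>\d+)"
)


def reset_ids(lines):
    """Count internal device reset events per SCSI ID."""
    counts = Counter()
    for line in lines:
        m = IDR_RE.search(line)
        if m:
            counts[int(m.group("id"))] += 1
    return counts


def diagnose(counts):
    """Apply the rule from the text: same ID -> disk, varying IDs -> port/cable."""
    if not counts:
        return "no internal device reset events found"
    if len(counts) == 1:
        (scsi_id,) = counts  # the only key in the counter
        return f"SCSI ID is always {scsi_id}: suspect the hard disk"
    return "SCSI ID varies: suspect the cascading port or cascading cable"


sample = [
    "AK-I kernel: [18842412991]mptbase: ioc0: WARNING - MPT event:(0Fh) : "
    "SAS Device Status Change: Internal Device Reset: ha0 channel=0 id=10",
]

print(diagnose(reset_ids(sample)))
```

The function only reports which branch of the rule applies. Whether to replace the disk remains the engineer's decision.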
2) When the hard disk link disconnects intermittently, the keyword for a dropped link is "Rate Unknown", and the keyword for a successfully negotiated link is "Rate 3.0":
AK-I kernel: [18842412939]mptbase: ioc0: WARNING - MPT event:(12h) : SAS PHY Link Status: Phy=10: Rate Unknown
AK-I kernel: [18842396939]mptbase: ioc0: WARNING - MPT event:(12h) : SAS PHY Link Status: Phy=10: Rate 3.0 Gpbs
Check Phy=10. If the log always reports intermittent disconnection for the disk on Phy=10, that hard disk is faulty and should be replaced. If many PHYs disconnect intermittently and the messages are printed continuously, contact the R&D engineers. (Intermittent disconnection is normal while SES is being upgraded, but it should not persist.)
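The PHY check can be sketched the same way by counting "Rate Unknown" events per PHY. This is a minimal Python sketch under stated assumptions. The pattern is modeled only on the two "SAS PHY Link Status" lines quoted above. The text says "many" PHYs without giving a number, so treating more than one PHY as "many" is an assumption made here. The names LINK_RE, link_drops, and classify are hypothetical.

```python
import re
from collections import Counter

# Pattern modeled on the two mptbase "SAS PHY Link Status" lines quoted above.
LINK_RE = re.compile(
    r"SAS PHY Link Status:\s*Phy=(?P<phy>\d+):\s*Rate\s+(?P<rate>Unknown|[\d.]+)"
)


def link_drops(lines):
    """Count link-down ("Rate Unknown") events per PHY number."""
    drops = Counter()
    for line in lines:
        m = LINK_RE.search(line)
        if m and m.group("rate") == "Unknown":
            drops[int(m.group("phy"))] += 1
    return drops


def classify(drops):
    """One PHY -> suspect that disk; several PHYs -> escalate.

    Assumption: "many PHYs" in the text is taken to mean more than one.
    """
    if not drops:
        return "no link drops found"
    if len(drops) == 1:
        (phy,) = drops  # the only key in the counter
        return f"only Phy={phy} drops: suspect the disk on that PHY"
    return "several PHYs drop: contact R&D (unless an SES upgrade is in progress)"


sample = [
    "AK-I kernel: [18842412939]mptbase: ioc0: WARNING - MPT event:(12h) : "
    "SAS PHY Link Status: Phy=10: Rate Unknown",
    "AK-I kernel: [18842396939]mptbase: ioc0: WARNING - MPT event:(12h) : "
    "SAS PHY Link Status: Phy=10: Rate 3.0 Gpbs",
]

print(classify(link_drops(sample)))
```

Only the "Rate Unknown" line counts as a drop. The "Rate 3.0" line marks a link that came back up and is ignored by the counter.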
Suggestions
None
