How to recover the OS when an LSI 1078 RAID 5 group loses two disks

Publication Date: 2014-12-16
Issue Description
Server Model: RH2288 V2     RAID Card: LSI 1078

The customer reported that two disks in the server had failed and the OS could not boot. The server held critical data and was running critical services.
Alarm Information
1: Two disks are shown as faulty, both by their indicator color and by the alarms in iMana.

Handling Process
1: Log in to the RAID BIOS console and check the status shown in the logical view.

2: Check the physical view in the RAID BIOS console. The status of the slot1 disk is "Unconfigured Bad, Not Responding", and the status of the slot3 disk is "FOREIGN, Unconfigured Bad".
Root Cause
The status "Unconfigured Bad, Not Responding" means the disk has failed physically and cannot be recovered.

The status "FOREIGN, Unconfigured Bad" means the disk is failing, for example because of many bad sectors or some other problem, but has not failed completely. In this case we can still try to recover it.
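The distinction between the two statuses can be written down as a small lookup. This is only a sketch of the rule described in this case: the function name classify_disk and the result labels are made up for illustration, and other controllers or firmware versions may word the statuses differently.

```python
def classify_disk(status: str) -> str:
    """Classify a physical-drive status string as described in this case.

    Returns one of the illustrative labels 'dead', 'try-recovery' or 'unknown'.
    """
    # Split "A,B" or "A, B" into a set of lower-case flags.
    flags = {part.strip().lower() for part in status.split(",")}

    if "unconfigured bad" not in flags:
        return "unknown"          # not one of the two statuses seen here
    if "not responding" in flags:
        return "dead"             # physically broken, cannot be recovered
    if "foreign" in flags:
        return "try-recovery"     # failing, but recovery can be attempted
    return "unknown"


if __name__ == "__main__":
    print(classify_disk("Unconfigured Bad,Not Responding"))  # slot1 in this case
    print(classify_disk("FOREIGN,Unconfigured Bad"))         # slot3 in this case
```

Any status that is not one of the two seen in this case is reported as 'unknown' rather than guessed at.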
Solution
1: In the RAID BIOS console, click the “Unconfigured Bad” disk to open its properties page;
2: Click “Make UnconfGood”->“GO”;
3: After the disk status changes to “Unconfigured Good”, go to the home page;
4: Click “Controller Selection”;
5: Press “start”, then press “preview”;
6: Choose the Drive Group, then press “Import”;
7: Return to the home page. The configuration has now been imported and the status of the virtual drives has changed to "degraded";
8: Click “Virtual Drives”;
9: Choose “Set Boot Drive(current=NONE)”->“GO”;
10: Confirm that it now shows “Set Boot Drive(current=0)”;
11: Go back to the home page;
12: Click “Exit”->“Yes” to exit the RAID BIOS console, then press “Ctrl+Alt+Delete” to reboot the server;
13: Boot into the OS, back up the data to another server, then replace the faulty disks.
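Why bringing back a single disk is enough to boot can be seen from how RAID 5 tolerates failures: it survives the loss of one member disk, but not two. The following is a minimal sketch of that rule. The function name raid5_state is made up for illustration, and the state names are simplified labels rather than exact controller output.

```python
def raid5_state(failed_members: int) -> str:
    """Return the state of a RAID 5 virtual drive for a given number of
    missing or failed member disks (illustrative model only)."""
    if failed_members < 0:
        raise ValueError("failed_members must not be negative")
    if failed_members == 0:
        return "optimal"    # all members present, full redundancy
    if failed_members == 1:
        return "degraded"   # data still readable, but no redundancy left
    return "offline"        # two or more members lost, data not accessible


if __name__ == "__main__":
    # Before the procedure: slot1 and slot3 are both out of the group.
    print(raid5_state(2))
    # After slot3 is made good and imported: only slot1 is still missing.
    print(raid5_state(1))
```

In the degraded state the group has no redundancy left, which is why the data should be backed up before anything else is done.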
Suggestions
In this case, the slot3 disk failed 15 days after the slot1 disk.
The customer was not monitoring the server closely and did not notice the problem until the second disk failed and the OS became inaccessible.
After the data is backed up, replace the disk in slot3, then replace the disk in slot1.

If any disk fails in a redundant RAID group, we suggest replacing the faulty disk as soon as possible, rather than waiting until another disk fails and the RAID group goes offline.
When the disk status is "FOREIGN, Unconfigured Bad", we can sometimes recover the data, but this disk still needs to be replaced, because it is failing and will break down soon.
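One way to catch the first failed disk early is to check drive states on a schedule. The following is a minimal sketch, assuming a hypothetical text report with "Slot Number" and "Firmware state" lines. The sample text, the field names and the function find_unhealthy are illustrative assumptions, not output captured from this server, so adapt the patterns to whatever your management tool actually prints.

```python
import re
from typing import Dict

# Hypothetical report text; the real format depends on the management tool.
SAMPLE_REPORT = """\
Slot Number: 0
Firmware state: Online, Spun Up
Slot Number: 1
Firmware state: Unconfigured(bad)
Slot Number: 2
Firmware state: Online, Spun Up
"""


def find_unhealthy(report: str) -> Dict[int, str]:
    """Map slot number -> state for every drive whose state is not Online.

    Note: this simple rule also flags states such as hot spares; filter
    those out if they are expected in your configuration.
    """
    unhealthy: Dict[int, str] = {}
    slot = None
    for line in report.splitlines():
        m = re.match(r"\s*Slot Number:\s*(\d+)", line)
        if m:
            slot = int(m.group(1))   # remember which drive the next state belongs to
            continue
        m = re.match(r"\s*Firmware state:\s*(.+)", line)
        if m and slot is not None:
            state = m.group(1).strip()
            if not state.lower().startswith("online"):
                unhealthy[slot] = state
    return unhealthy


if __name__ == "__main__":
    for slot, state in find_unhealthy(SAMPLE_REPORT).items():
        print(f"slot {slot}: {state} -> replace this disk as soon as possible")
```

Run from a scheduler and connected to an alerting channel, a check like this would have flagged the slot1 failure well before the slot3 disk failed.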
