RH1288 V3 server freezes at night while doing backup

Publication Date: 2017-11-22
Issue Description

Almost every night while the backup was running, the server froze for nearly 5 minutes and did not respond. Errors and alarms were seen on the backup server. After the freeze, the server recovered on its own without any intervention from an engineer. The main problem was that during these 5 minutes all VMs were completely unavailable.

Server software: Microsoft Windows Server 2012 R2

Backup software: Microsoft DPM

Alarm Information

The following Windows alarms were reported (no iBMC alarms were generated):

 

1)   
Input-output operation of logical block for disk 1
Returning to device\device\raidport2                         
Driver had error with controller \device\raidport2  

Error in the RAID controller logs:

Controller encountered a fatal error and was reset

Handling Process

First, we analyzed the logs and saw many events indicating RAID controller restarts:

1) "Controller encountered a fatal error and was reset" was seen in dump_info\LogDump\LSI_RAID_Controller_Log (collected via one-click information collection)

2) "Controller encountered a fatal error and was reset" was seen in raid\sasraidlog.txt (collected via the info collect tool)
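When several servers have to be checked, counting these events by hand is tedious. The following is a minimal Python sketch, not part of the original case handling, that counts lines containing the message quoted above. The function names are hypothetical, and the only assumption about the log format is that each event occupies one text line that contains the message.

```python
from pathlib import Path

# Message text quoted in this case. Nothing else about the log line
# layout (timestamps, prefixes) is assumed.
FATAL_RESET = "Controller encountered a fatal error and was reset"


def count_fatal_resets(lines):
    """Count lines that contain the fatal-reset message (case-insensitive)."""
    needle = FATAL_RESET.lower()
    return sum(1 for line in lines if needle in line.lower())


def count_in_file(path):
    """Count fatal-reset lines in one collected log file."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with Path(path).open("r", encoding="utf-8", errors="replace") as handle:
        return count_fatal_resets(handle)
```

Calling `count_in_file` on the collected sasraidlog.txt from each server gives a rough way to compare how often the controller was reset on each of them.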

We replaced the RAID controller. The error occurred less often afterwards, but it still occurred.

We asked the customer to test with different numbers of VMs on the hosts. With 5-6 VMs the server did not show the problem; with 8 VMs the problem occurred.

Because the customer had 3-4 such servers with the same software, we also asked for logs from a "good" server and analyzed them to compare with the problem server.

After detailed analysis, R&D confirmed that it was a firmware problem: the RAID controller was unable to process such a large amount of data.

 

 

Root Cause

The root cause was outdated firmware that could not handle the amount of data processed during backup, which caused the RAID controller to restart. Upgrading the firmware to the latest version solved the problem.

Solution

The solution is to upgrade the RAID card firmware to the latest version (it is better to create a full server backup before upgrading). We used firmware 4.660 to resolve this issue on the 3108 RAID card.

You will need this tool (or a newer version): http://support.huawei.com/enterprise/en/software/22400692-SW1000258493

The upgrade guide is here: http://support.huawei.com/enterprise/en/doc/DOC1000061915?idPath=7919749%7C9856522%7C9856629%7C21015513

 

Suggestions

We suggest upgrading the firmware before replacing the RAID card.

END