1, The issue involves both the server and the OS, so we collected the hardware log and the OS log with the FusionServer info-collect tool.
2, The SEL file in the hardware log shows several restart messages but no hardware fault.
3, Based on the information above, the server restarts were triggered from the operating system side, so we need to check the OS log.
4, The OS is CentOS 6.8. When the OS crashes, the error message is recorded in the crash dump file, which is named vmcore-dmesg.txt.
5, In this dmesg file, we found the following information:
<1>BUG: unable to handle kernel paging request at ffff882111afa00c
<1>IP: [<ffffffffa02598b2>] ixgbe_xmit_frame_ring+0x2d2/0xea0 [ixgbe]
<4>PGD 1a8e063 PUD 0
6, According to the error message above, when the system crashed (unable to handle kernel paging request), the faulting instruction was in the network card driver (ixgbe).
7, Because the issue started after the OS was upgraded to CentOS 6.8 and is related to the network card, the root cause points to the network card driver in the new OS.
After the upgrade to CentOS 6.8, the network card driver in effect is the one bundled with the OS. However, for the network card it is recommended to use the driver from the hardware vendor, because the driver bundled with the OS has many issues.
From the SN of the server, we found that the BOM code of the network card involved is 06310023. However, no driver for this network card is available on the Huawei website for CentOS 6.8.
In this case, we suggest downloading the latest driver from the Intel website.
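The two checks in this analysis (finding the faulting module in vmcore-dmesg.txt, and confirming which driver version is in effect) can be sketched in Python as below. This is only an illustrative sketch, not part of the info-collect tool. The function names, the regular expression, and the idea of feeding in the text printed by `ethtool -i <interface>` are our own assumptions. The sample text is the excerpt quoted in step 5.

```python
import re

# Hypothetical helper pattern: matches the faulting-instruction line of a
# kernel oops, for example:
#   <1>IP: [<ffffffffa02598b2>] ixgbe_xmit_frame_ring+0x2d2/0xea0 [ixgbe]
IP_LINE = re.compile(
    r"IP: \[<(?P<addr>[0-9a-f]+)>\] "
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)


def find_faulting_module(dmesg_text):
    """Return (symbol, module) from the first oops IP line, or None.

    module is None when the faulting symbol belongs to the core kernel
    rather than a loadable module.
    """
    for line in dmesg_text.splitlines():
        match = IP_LINE.search(line)
        if match:
            return match.group("symbol"), match.group("module")
    return None


def parse_ethtool_info(output):
    """Turn the 'key: value' lines printed by `ethtool -i` into a dict."""
    info = {}
    for line in output.splitlines():
        # Split on the first colon only, so values such as PCI addresses
        # (which contain colons themselves) are kept whole.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


# Excerpt from vmcore-dmesg.txt quoted in step 5.
SAMPLE_DMESG = """\
<1>BUG: unable to handle kernel paging request at ffff882111afa00c
<1>IP: [<ffffffffa02598b2>] ixgbe_xmit_frame_ring+0x2d2/0xea0 [ixgbe]
<4>PGD 1a8e063 PUD 0
"""

if __name__ == "__main__":
    print(find_faulting_module(SAMPLE_DMESG))
```

On the server itself, the text for `find_faulting_module` would come from reading vmcore-dmesg.txt. For `parse_ethtool_info`, comparing the `driver` and `version` fields before and after installing the Intel driver shows whether the new driver is actually the one loaded.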