MCU reboots at a customer site interrupt conferences and prevent conference scheduling

Published: 2016-08-27
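Problem Description

The customer reported that all sites dropped out during a conference. After the conference was ended and reconvened, the sites rejoined, but later all sites dropped out again. Reconvening then failed with a message that MCU resource allocation failed.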

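Troubleshooting Process

1. Analyze the RM logs to find the cause of the conference interruption

The RM logs show that the conference was interrupted because the MCU rebooted and went offline. Attempts to convene a conference while the MCU was offline failed with a resource allocation error.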

 

RM exception records

8:51  MCU offline, rebooting

9:34  MCU offline

9:36  MCU back online after reboot, then offline again

 

 

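2. Confirm the time offset between the RM and the MCU

Using the start time of conference A as a reference, 8:19 on the RM corresponds to 10:13 on the MCU, so the MCU clock is 1 hour 54 minutes ahead of the RM clock. (Check several correlated timestamps to confirm the offset.)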
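The offset check can be sketched in a few lines of Python. This is only an illustration: the helper name `clock_offset` is hypothetical, and the only data used is the single RM/MCU pair from this case. Further correlated pairs read from the logs can be appended to `pairs`; the function raises an error if they do not all give the same offset.

```python
from datetime import datetime, timedelta

def clock_offset(pairs):
    """Return the MCU-minus-RM clock offset if every pair agrees, else raise."""
    offsets = set()
    for rm_time, mcu_time in pairs:
        rm = datetime.strptime(rm_time, "%H:%M")
        mcu = datetime.strptime(mcu_time, "%H:%M")
        offsets.add(mcu - rm)
    if len(offsets) != 1:
        raise ValueError(f"inconsistent or missing offsets: {sorted(offsets)}")
    return offsets.pop()

# The one pair given in this case: conference A started at 8:19 (RM) / 10:13 (MCU).
pairs = [("8:19", "10:13")]
offset = clock_offset(pairs)
print(offset)  # 1:54:00
```

The sketch assumes both timestamps fall on the same day and that neither clock was adjusted between the events being compared.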

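RM log

MCU O-type log (operation log)

Open the most recent MCU O-type log. (Each MCU log file is named after the timestamp of its first record. When a file reaches its record limit, the next log file is created.)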

 

RM exception record                                    Corresponding MCU time

8:51  MCU offline, rebooting                           10:45

9:34  MCU offline                                      11:28

9:36  MCU back online after reboot, then offline again 11:30
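The mapping from RM time to MCU time is plain clock arithmetic with the 1 hour 54 minute offset from step 2. A minimal sketch follows, assuming the RM entries are read as 8:51, 9:34 and 9:36; the helper name `rm_to_mcu` is hypothetical.

```python
from datetime import datetime, timedelta

# MCU clock minus RM clock, as determined in step 2.
OFFSET = timedelta(hours=1, minutes=54)

def rm_to_mcu(rm_time: str) -> str:
    """Convert an RM wall-clock time (H:MM) to the matching MCU time."""
    t = datetime.strptime(rm_time, "%H:%M") + OFFSET
    return f"{t.hour}:{t.minute:02d}"

rm_events = ["8:51", "9:34", "9:36"]
mcu_times = [rm_to_mcu(t) for t in rm_events]
print(mcu_times)  # ['10:45', '11:28', '11:30']
```

These converted times are the ones to look for in the MCU logs in the next step.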

 

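3. Review the MCU logs and analyze the MCU state during the problem period

Open the most recent MCU A-type (alarm) log:

 

At the times corresponding to the MCU going offline, the log records that the MCU temperature was too high. The temperature exceeded the threshold and triggered a reboot, so the sites dropped out and the conference could not be scheduled.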

 

 [Security][Notice] 2016-08-18 10:40:16 main 51766 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose]the board [3] temperature 76 has exception!

[Security][Notice] 2016-08-18 10:40:20 main 51841 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose]the board [3] temperature 76 has exception!

[Security][Notice] 2016-08-18 10:40:30 main 52412 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose]the board [3] temperature 76 has exception!

[Security][Notice] 2016-08-18 10:42:06 main 55416 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose]the board [1] temperature 76 has exception!

[Security][Notice] 2016-08-18 10:42:24 main 56011 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose]the board [1] temperature 76 has exception!

[Security][Notice] 2016-08-18 10:42:38 main 56390 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose]the board [1] temperature 76 has exception!

[Security][Notice] 2016-08-18 10:45:30 main 61692 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose][reboot]reboot subboard 2 when temp > 80

[Security][Notice] 2016-08-18 10:45:31 main 61705 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose][reboot]reboot subboard 2 when temp > 80

 [Security][Notice] 2016-08-18 11:25:33 main 15668 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose]the board [1] temperature 76 has exception!

[Security][Notice] 2016-08-18 11:25:39 main 15726 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose]the board [1] temperature 76 has exception!

[Security][Notice] 2016-08-18 11:28:15 main 19966 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose][reboot]reboot subboard 2 when temp > 80

[Security][Notice] 2016-08-18 11:28:16 main 19974 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose][reboot]reboot subboard 2 when temp > 80

 [Security][Notice] 2016-08-18 11:30:29 main 56553 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose][reboot]reboot subboard 2 when temp > 80

[Security][Notice] 2016-08-18 11:30:30 main 56608 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose][reboot]reboot subboard 2 when temp > 80
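Scanning a long alarm log by eye is error-prone, so the same check can be sketched as a small Python filter. The regular expression is written only against the line shapes shown above and is an assumption about the format, not a documented log specification; `summarize` is a hypothetical helper. Each reboot appears twice, one second apart, so reboot events are grouped by minute.

```python
import re
from collections import Counter

# Matches the two line shapes seen in this case: temperature warnings and reboots.
LINE_RE = re.compile(
    r"\[Security\]\[Notice\] (?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*?"
    r"\[diagnose\](?:the board \[(?P<board>\d+)\] temperature (?P<temp>\d+) has exception!"
    r"|\[reboot\]reboot subboard (?P<sub>\d+) when temp > (?P<limit>\d+))"
)

def summarize(lines):
    """Count temperature warnings per board and list reboot times by minute."""
    warnings = Counter()
    reboot_minutes = set()
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not a temperature or reboot record
        if m["board"] is not None:
            warnings[int(m["board"])] += 1
        else:
            reboot_minutes.add(m["ts"][:16])  # drop the seconds
    return warnings, sorted(reboot_minutes)

# A few lines copied from the alarm log above.
sample = [
    "[Security][Notice] 2016-08-18 10:40:16 main 51766 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose]the board [3] temperature 76 has exception!",
    "[Security][Notice] 2016-08-18 10:42:06 main 55416 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose]the board [1] temperature 76 has exception!",
    "[Security][Notice] 2016-08-18 10:45:30 main 61692 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose][reboot]reboot subboard 2 when temp > 80",
    "[Security][Notice] 2016-08-18 10:45:31 main 61705 Null [EXPARA:14<CPUID=0x1000>]string:[main][diagnose][reboot]reboot subboard 2 when temp > 80",
]
warnings, reboots = summarize(sample)
print(dict(warnings))  # {3: 1, 1: 1}
print(reboots)         # ['2016-08-18 10:45']
```

Run over the full log excerpt, the reboot minutes would be expected to line up with the MCU times derived from the RM exception records (10:45, 11:28, 11:30).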

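Root Cause

1. On-site inspection found heavy dust buildup on the dust filter, which degraded heat dissipation. During the conference the temperature rose to the threshold and the MCU rebooted, interrupting the conference.

 

2. Conferences convened while the MCU was offline failed with the message that MCU resource allocation failed.

Solution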

