Network packet loss after clustering two S12708 switches

Published: 2014-10-31
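Problem Description
1. Three-tier network: PC - access switch - aggregation switch - core switch - server.
2. When the PC pings the server, packets are lost on the core switch S12708.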


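Alarm Information
Handling Process
1. Enable traffic statistics on the uplink and downlink interfaces of the access switch (S3700), the aggregation switch (S9303), and the core switch (S12708).
2. Start pinging the server from the PC and watch for packet loss (the ping can be stopped as soon as loss is observed).
3. Packets are lost in the outbound direction of the S12708 interface that connects to the S9303: 5,006 packets were received in the inbound direction, but only 5,004 were sent in the outbound direction.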

<S12708-A>dis traffic policy statistics interface g2/2/0/1 i

Interface: GigabitEthernet2/2/0/1
Traffic policy inbound: pp
Rule number: 2
Current status: OK!
Statistics interval: 300
---------------------------------------------------------------------
Board : 2/2
---------------------------------------------------------------------
Matched          |      Packets:                         5,006
                  |      Bytes:                         410,492
                  |      Rate(pps):                           0
                  |      Rate(bps):                           0
---------------------------------------------------------------------
   Passed         |      Packets:                         5,006
                  |      Bytes:                         410,492
                  |      Rate(pps):                           0
                  |      Rate(bps):                           0
---------------------------------------------------------------------

<S12708-A>dis traffic policy statistics interface g2/2/0/1 o

Interface: GigabitEthernet2/2/0/1
Traffic policy outbound: pp
Rule number: 2
Current status: OK!
Statistics interval: 300
---------------------------------------------------------------------
Board : 2/2
---------------------------------------------------------------------
Matched          |      Packets:                         5,004
                  |      Bytes:                         410,328
                  |      Rate(pps):                           0
                  |      Rate(bps):                           0
---------------------------------------------------------------------
   Passed         |      Packets:                         5,004
                  |      Bytes:                         410,328
                  |      Rate(pps):                           0
                  |      Rate(bps):                           0
---------------------------------------------------------------------
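
The loss figure in step 3 is the difference between the two Passed counters. As a minimal sketch (not part of the original case), the Python snippet below pulls those counters out of the command output and subtracts them. The function name `passed_packets` is hypothetical, and the sample strings are abridged from the output above.

```python
import re


def passed_packets(output: str) -> int:
    """Return the packet count from the 'Passed' section of the output."""
    # The count sits on the 'Packets:' line that opens the 'Passed' section.
    match = re.search(r"Passed\s*\|\s*Packets:\s*([\d,]+)", output)
    if match is None:
        raise ValueError("no 'Passed' packet counter found")
    return int(match.group(1).replace(",", ""))


# Abridged copies of the inbound and outbound statistics shown above.
inbound = """
Matched          |      Packets:                         5,006
   Passed         |      Packets:                         5,006
"""
outbound = """
Matched          |      Packets:                         5,004
   Passed         |      Packets:                         5,004
"""

lost = passed_packets(inbound) - passed_packets(outbound)
print(f"in={passed_packets(inbound)} out={passed_packets(outbound)} lost={lost}")
# prints: in=5006 out=5004 lost=2
```

Running the same comparison on every hop where statistics were enabled would show where the counters first diverge.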

4. On the S12708, a large number of STP TC BPDUs were received during the period in which the packet loss occurred.

  ===============display stp tc-bpdu statistics===============
==================================================================
--------------------STP TC/TCN information----------------------
MSTID Port                  TC(Send/Receive)      TCN(Send/Receive)
0     GigabitEthernet2/2/0/1      51/2                  0/0
0     GigabitEthernet2/2/0/2      40/2                  0/0
0     GigabitEthernet2/2/0/3      40/2                  0/0
0     GigabitEthernet2/2/0/4      40/2                  0/0
0     GigabitEthernet2/2/0/6      31/2                  0/0
0     GigabitEthernet2/2/0/7      51/3                  0/0
0     GigabitEthernet2/2/0/8      56/3                  0/0
0     GigabitEthernet2/2/0/9      50/21                 0/0
0     GigabitEthernet2/2/0/10     54/21                 0/0
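
To see which ports are actually receiving TC BPDUs, the Receive half of the TC column can be ranked per port. The sketch below is illustrative only: `received_tc_by_port` is a hypothetical helper, and it assumes rows shaped exactly like the table above.

```python
def received_tc_by_port(table: str) -> dict:
    """Map each port name to the number of TC BPDUs it received."""
    counts = {}
    for line in table.splitlines():
        fields = line.split()
        # Data rows have four fields: MSTID, port, TC send/receive,
        # TCN send/receive. The header row fails the isdigit() check.
        if len(fields) == 4 and fields[0].isdigit():
            _sent, received = fields[2].split("/")
            counts[fields[1]] = int(received)
    return counts


# Rows copied from the display stp tc-bpdu statistics output above.
table = """\
MSTID Port                  TC(Send/Receive)      TCN(Send/Receive)
0     GigabitEthernet2/2/0/1      51/2                  0/0
0     GigabitEthernet2/2/0/2      40/2                  0/0
0     GigabitEthernet2/2/0/3      40/2                  0/0
0     GigabitEthernet2/2/0/4      40/2                  0/0
0     GigabitEthernet2/2/0/6      31/2                  0/0
0     GigabitEthernet2/2/0/7      51/3                  0/0
0     GigabitEthernet2/2/0/8      56/3                  0/0
0     GigabitEthernet2/2/0/9      50/21                 0/0
0     GigabitEthernet2/2/0/10     54/21                 0/0
"""

counts = received_tc_by_port(table)
highest = max(counts.values())
top_ports = [port for port, n in counts.items() if n == highest]
print(highest, top_ports)
# prints: 21 ['GigabitEthernet2/2/0/9', 'GigabitEthernet2/2/0/10']
```

The two ports with the highest receive count are the same ports named in the log entries that follow, which makes them the natural starting point when tracing the TC source.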

Oct 18 2014 14:22:52 S12708-A %%01MSTP/6/RECEIVE_MSTITC(l)[101734]:MSTP received BPDU with TC, MSTP process 0 instance 0, port name is GigabitEthernet2/2/0/10.
Oct 18 2014 14:22:54 S12708-A %%01MSTP/6/RECEIVE_MSTITC(l)[101738]:MSTP received BPDU with TC, MSTP process 0 instance 0, port name is GigabitEthernet2/2/0/9.
Oct 18 2014 14:22:56 S12708-A %%01MSTP/6/RECEIVE_MSTITC(l)[101743]:MSTP received BPDU with TC, MSTP process 0 instance 0, port name is GigabitEthernet2/2/0/9.
Oct 18 2014 14:26:53 S12708-A %%01MSTP/6/RECEIVE_MSTITC(l)[101991]:MSTP received BPDU with TC, MSTP process 0 instance 0, port name is GigabitEthernet2/2/0/9.
Oct 18 2014 14:26:55 S12708-A %%01MSTP/6/RECEIVE_MSTITC(l)[101994]:MSTP received BPDU with TC, MSTP process 0 instance 0, port name is GigabitEthernet2/2/0/9.
Oct 18 2014 14:26:58 S12708-A %%01MSTP/6/RECEIVE_MSTITC(l)[101999]:MSTP received BPDU
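
Step 4 depends on matching the TC log timestamps against the time of the packet loss. One way to make that comparison easier, sketched here under assumptions, is to group the log entries into bursts. The names `tc_events` and `group_bursts` and the 60-second gap threshold are hypothetical choices for illustration, not something the original case prescribes.

```python
import re
from datetime import datetime, timedelta

# Matches complete "received BPDU with TC" entries and captures the
# timestamp and the port name. Month names are assumed to be English.
LOG_RE = re.compile(
    r"^(\w{3} \d{1,2} \d{4} \d{2}:\d{2}:\d{2}) .*"
    r"MSTP received BPDU with TC.*port name is (\S+)\.$"
)


def tc_events(log: str) -> list:
    """Return (timestamp, port) for every complete TC-received log line."""
    events = []
    for line in log.splitlines():
        match = LOG_RE.match(line.strip())
        if match:  # truncated lines without a port name are skipped
            when = datetime.strptime(match.group(1), "%b %d %Y %H:%M:%S")
            events.append((when, match.group(2)))
    return events


def group_bursts(events: list, max_gap=timedelta(seconds=60)) -> list:
    """Split time-ordered events into bursts separated by more than max_gap."""
    bursts = []
    for event in events:
        if bursts and event[0] - bursts[-1][-1][0] <= max_gap:
            bursts[-1].append(event)
        else:
            bursts.append([event])
    return bursts


# Log lines copied from above; the last one is truncated in the source.
log = """\
Oct 18 2014 14:22:52 S12708-A %%01MSTP/6/RECEIVE_MSTITC(l)[101734]:MSTP received BPDU with TC, MSTP process 0 instance 0, port name is GigabitEthernet2/2/0/10.
Oct 18 2014 14:22:54 S12708-A %%01MSTP/6/RECEIVE_MSTITC(l)[101738]:MSTP received BPDU with TC, MSTP process 0 instance 0, port name is GigabitEthernet2/2/0/9.
Oct 18 2014 14:22:56 S12708-A %%01MSTP/6/RECEIVE_MSTITC(l)[101743]:MSTP received BPDU with TC, MSTP process 0 instance 0, port name is GigabitEthernet2/2/0/9.
Oct 18 2014 14:26:53 S12708-A %%01MSTP/6/RECEIVE_MSTITC(l)[101991]:MSTP received BPDU with TC, MSTP process 0 instance 0, port name is GigabitEthernet2/2/0/9.
Oct 18 2014 14:26:55 S12708-A %%01MSTP/6/RECEIVE_MSTITC(l)[101994]:MSTP received BPDU with TC, MSTP process 0 instance 0, port name is GigabitEthernet2/2/0/9.
Oct 18 2014 14:26:58 S12708-A %%01MSTP/6/RECEIVE_MSTITC(l)[101999]:MSTP received BPDU
"""

events = tc_events(log)
for burst in group_bursts(events):
    print(burst[0][0].time(), burst[-1][0].time(), len(burst))
# prints:
# 14:22:52 14:22:56 3
# 14:26:53 14:26:55 2
```

Each burst window can then be checked against the moments when ping replies were lost.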


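Root Cause
After receiving a TC BPDU, the S12708 triggers ARP relearning on all STP-enabled interfaces. Packets are dropped when the ARP requests are not answered in time.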
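Solution
The four interfaces interconnecting the S12708 and the S9303 (G1/2/0/1, G1/2/0/2, G2/2/0/1, and G2/2/0/2) each carry different VLANs, so no loop exists across these links at the service level. STP can therefore be disabled on these interfaces. Once STP is disabled, the interfaces no longer run STP and are not affected by STP convergence.
If STP must remain enabled on the interfaces, trace the source of the TC BPDUs and check whether an STP topology change occurred at the times the TC BPDUs were generated.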
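Suggestions and Summary
1. When packet loss occurs, first pinpoint where the packets are being dropped (segment-by-segment traffic statistics can be used).
2. After identifying the device that drops packets, check whether its running state and the state of each protocol were abnormal at that time.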
