SUSE 11 SP3 Kernel Counts Packets of Unknown Protocol Types as NIC Drops

Published: 2015-10-31
Problem Description

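On an RH2288 v2 rack server with a tg3 NIC, no packet drops were reported under SUSE 11 SP1 (kernel 2.6.32). After the upgrade to SUSE 11 SP3 (kernel 3.0.101), the NIC's dropped counter keeps growing by several dozen packets per second, as shown below.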

eth3      Link encap:Ethernet  HWaddr A4:DC:BE:F6:C7:F7
          inet addr:192.168.2.81  Bcast:192.168.255.255  Mask:255.255.0.0
          inet6 addr: fe80::a6dc:beff:fef6:c7f7/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:88598 errors:0 dropped:7387 overruns:0 frame:0
          TX packets:373 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6721091 (6.4 Mb)  TX bytes:36204 (35.3 Kb)
          Interrupt:32

• NIC driver version:

linux-2288V3-81:/tmp/tg3 # ethtool -i eth3
driver: tg3
version: 3.137j
firmware-version: 5719-v1.31 NCSI v1.2.12.0
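
The growth rate of the dropped counter can be measured by sampling the ifconfig output twice. The following is a minimal sketch, assuming the classic net-tools output format shown above; the helper names `parse_rx_counters` and `drop_rate` are hypothetical and introduced only for illustration.

```python
import re

# Matches the RX line of classic net-tools ifconfig output, for example:
# "RX packets:88598 errors:0 dropped:7387 overruns:0 frame:0"
RX_LINE = re.compile(r"RX packets:(\d+)\s+errors:(\d+)\s+dropped:(\d+)")


def parse_rx_counters(ifconfig_text):
    """Return (rx_packets, rx_dropped) parsed from ifconfig output text."""
    match = RX_LINE.search(ifconfig_text)
    if match is None:
        raise ValueError("no RX counter line found")
    return int(match.group(1)), int(match.group(3))


def drop_rate(first_dropped, second_dropped, interval_seconds):
    """Average number of newly dropped packets per second between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return (second_dropped - first_dropped) / interval_seconds


# The RX line taken from the ifconfig output in this case.
sample = "          RX packets:88598 errors:0 dropped:7387 overruns:0 frame:0"
packets, dropped = parse_rx_counters(sample)
print(packets, dropped)
```

Taking a second sample some seconds later and passing both dropped values to `drop_rate` gives the per-second increase described in the problem statement.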

Alarm Information

eth3      Link encap:Ethernet  HWaddr A4:DC:BE:F6:C7:F7
          inet addr:192.168.2.81  Bcast:192.168.255.255  Mask:255.255.0.0
          inet6 addr: fe80::a6dc:beff:fef6:c7f7/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:88598 errors:0 dropped:7387 overruns:0 frame:0
          TX packets:373 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6721091 (6.4 Mb)  TX bytes:36204 (35.3 Kb)
          Interrupt:32

Handling Process

• First, use ethtool -S to examine the detailed statistics read from the NIC registers:

NIC statistics:
     rx_octets: 6711137
     rx_ucast_packets: 434
     rx_mcast_packets: 5335
     rx_bcast_packets: 82679
     tx_octets: 36204
     tx_ucast_packets: 364
     tx_mcast_packets: 6
     tx_bcast_packets: 3

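Initial analysis: no single counter, and no sum of several counters, equals the dropped value reported by ifconfig. The drops are therefore probably made by the protocol stack rather than by the NIC itself, because drops made by the NIC itself can usually be read through ethtool -S.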
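
The comparison described above can be automated. The sketch below parses the ethtool -S output from this case and searches for any combination of counters whose sum equals the ifconfig dropped value (7387). The function names and the limit of three counters per combination are assumptions made for illustration.

```python
from itertools import combinations


def parse_ethtool_stats(text):
    """Parse 'name: value' lines from `ethtool -S` output into a dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(":")
        value = value.strip()
        # The "NIC statistics:" header has no numeric value and is skipped.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def counters_matching(stats, target, max_terms=3):
    """Return every combination of up to max_terms counters summing to target."""
    matches = []
    for size in range(1, max_terms + 1):
        for names in combinations(sorted(stats), size):
            if sum(stats[name] for name in names) == target:
                matches.append(names)
    return matches


ETHTOOL_OUTPUT = """NIC statistics:
     rx_octets: 6711137
     rx_ucast_packets: 434
     rx_mcast_packets: 5335
     rx_bcast_packets: 82679
     tx_octets: 36204
     tx_ucast_packets: 364
     tx_mcast_packets: 6
     tx_bcast_packets: 3
"""

stats = parse_ethtool_stats(ETHTOOL_OUTPUT)
# An empty list means no counter combination explains the dropped value.
print(counters_matching(stats, 7387))
```

An empty result supports the conclusion that the drops are not recorded by the NIC hardware counters.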

Following the post below, the problem was analyzed further with SystemTap:

http://www.it165.net/os/html/201308/5944.html

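In the 2.6.32 kernel source, when the netif_receive_skb function encounters an unsupported protocol, the kernel simply drops the packet and does not increment dev->stat.dropped (the RX dropped value shown by ifconfig). In the 3.0 kernel, such packets are also counted in dev->stat.dropped.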

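Put simply: in the 3.0 kernel, every supported protocol is registered in a list. When a packet is received, the kernel compares its protocol field with the entries in that list. If nothing matches, the packet is discarded and the drop counter is incremented. Kernels before 3.0 also discarded such packets, but did not count them as drops.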
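
The behavior change can be illustrated with a toy model. This is a simplified sketch, not kernel code: the class `ToyDevice`, its dictionary of handlers, and the flag `count_unknown_drops` are hypothetical stand-ins for the kernel's protocol handler list and for the difference between the two kernel versions.

```python
# Toy model of protocol dispatch on receive. This is NOT kernel code.
ETH_P_IP = 0x0800
ETH_P_ARP = 0x0806
ETH_P_IPV6 = 0x86DD


class ToyDevice:
    def __init__(self, count_unknown_drops):
        # True models the 3.0 behavior, False models the 2.6.32 behavior.
        self.count_unknown_drops = count_unknown_drops
        self.rx_dropped = 0
        self.handlers = {}

    def add_handler(self, ethertype, func):
        """Register a handler for one protocol type."""
        self.handlers[ethertype] = func

    def receive(self, ethertype, payload=b""):
        """Deliver the frame to its handler, or discard it if none exists."""
        handler = self.handlers.get(ethertype)
        if handler is not None:
            handler(payload)
            return True
        # No handler: the frame is discarded in both models,
        # but only the newer model counts it as a drop.
        if self.count_unknown_drops:
            self.rx_dropped += 1
        return False


def make_device(count_unknown_drops):
    dev = ToyDevice(count_unknown_drops)
    for ethertype in (ETH_P_IP, ETH_P_ARP, ETH_P_IPV6):
        dev.add_handler(ethertype, lambda payload: None)
    return dev


# Two frames use 0x88aa, the unknown type seen in this case.
frames = [ETH_P_IP, 0x88AA, ETH_P_ARP, 0x88AA]
old_kernel = make_device(count_unknown_drops=False)
new_kernel = make_device(count_unknown_drops=True)
for ethertype in frames:
    old_kernel.receive(ethertype)
    new_kernel.receive(ethertype)

print(old_kernel.rx_dropped, new_kernel.rx_dropped)  # prints "0 2"
```

Both models discard the same frames; only the counter differs, which is why the upgrade makes drops visible without any change in traffic.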

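The 3.0 kernel currently defines the following protocol types. A packet whose type is not listed below is very likely unsupported and will be counted as a drop.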
#define ETH_P_LOOP 0x0060 /* Ethernet Loopback packet */
#define ETH_P_PUP 0x0200 /* Xerox PUP packet */
#define ETH_P_PUPAT 0x0201 /* Xerox PUP Addr Trans packet */
#define ETH_P_IP 0x0800 /* Internet Protocol packet */
#define ETH_P_X25 0x0805 /* CCITT X.25 */
#define ETH_P_ARP 0x0806 /* Address Resolution packet */
#define ETH_P_BPQ 0x08FF /* G8BPQ AX.25 Ethernet Packet [ NOT AN OFFICIALLY REGISTERED ID ] */
#define ETH_P_IEEEPUP 0x0a00 /* Xerox IEEE802.3 PUP packet */
#define ETH_P_IEEEPUPAT 0x0a01 /* Xerox IEEE802.3 PUP Addr Trans packet */
#define ETH_P_DEC 0x6000 /* DEC Assigned proto */
#define ETH_P_DNA_DL 0x6001 /* DEC DNA Dump/Load */
#define ETH_P_DNA_RC 0x6002 /* DEC DNA Remote Console */
#define ETH_P_DNA_RT 0x6003 /* DEC DNA Routing */
#define ETH_P_LAT 0x6004 /* DEC LAT */
#define ETH_P_DIAG 0x6005 /* DEC Diagnostics */
#define ETH_P_CUST 0x6006 /* DEC Customer use */
#define ETH_P_SCA 0x6007 /* DEC Systems Comms Arch */
#define ETH_P_TEB 0x6558 /* Trans Ether Bridging */
#define ETH_P_RARP 0x8035 /* Reverse Addr Res packet */
#define ETH_P_ATALK 0x809B /* Appletalk DDP */
#define ETH_P_AARP 0x80F3 /* Appletalk AARP */
#define ETH_P_8021Q 0x8100 /* 802.1Q VLAN Extended Header */
#define ETH_P_IPX 0x8137 /* IPX over DIX */
#define ETH_P_IPV6 0x86DD /* IPv6 over bluebook */
#define ETH_P_PAUSE 0x8808 /* IEEE Pause frames. See 802.3 31B */
#define ETH_P_SLOW 0x8809 /* Slow Protocol. See 802.3ad 43B */
#define ETH_P_WCCP 0x883E /* Web-cache coordination protocol
* defined in draft-wilson-wrec-wccp-v2-00.txt */
#define ETH_P_PPP_DISC 0x8863 /* PPPoE discovery messages */
#define ETH_P_PPP_SES 0x8864 /* PPPoE session messages */
#define ETH_P_MPLS_UC 0x8847 /* MPLS Unicast traffic */
#define ETH_P_MPLS_MC 0x8848 /* MPLS Multicast traffic */
#define ETH_P_ATMMPOA 0x884c /* MultiProtocol Over ATM */
#define ETH_P_LINK_CTL 0x886c /* HPNA, wlan link local tunnel */
#define ETH_P_ATMFATE 0x8884 /* Frame-based ATM Transport
* over Ethernet
*/
#define ETH_P_PAE 0x888E /* Port Access Entity (IEEE 802.1X) */
#define ETH_P_AOE 0x88A2 /* ATA over Ethernet */
#define ETH_P_TIPC 0x88CA /* TIPC */
#define ETH_P_1588 0x88F7 /* IEEE 1588 Timesync */
#define ETH_P_FCOE 0x8906 /* Fibre Channel over Ethernet */
#define ETH_P_FIP 0x8914 /* FCoE Initialization Protocol */
#define ETH_P_EDSA 0xDADA /* Ethertype DSA [ NOT AN OFFICIALLY REGISTERED ID ] */
#define ETH_P_AF_IUCV 0xFBFB /* IBM af_iucv [ NOT AN OFFICIALLY REGISTERED ID ] */

#define ETH_P_802_3 0x0001 /* Dummy type for 802.3 frames */
#define ETH_P_AX25 0x0002 /* Dummy protocol id for AX.25 */
#define ETH_P_ALL 0x0003 /* Every packet (be careful!!!) */
#define ETH_P_802_2 0x0004 /* 802.2 frames */
#define ETH_P_SNAP 0x0005 /* Internal only */
#define ETH_P_DDCMP 0x0006 /* DEC DDCMP: Internal only */
#define ETH_P_WAN_PPP 0x0007 /* Dummy type for WAN PPP frames*/
#define ETH_P_PPP_MP 0x0008 /* Dummy type for PPP MP frames */
#define ETH_P_LOCALTALK 0x0009 /* Localtalk pseudo type */
#define ETH_P_CAN 0x000C /* Controller Area Network */
#define ETH_P_PPPTALK 0x0010 /* Dummy type for Atalk over PPP*/
#define ETH_P_TR_802_2 0x0011 /* 802.2 frames */
#define ETH_P_MOBITEX 0x0015 /* Mobitex (kaz@cafe.net) */
#define ETH_P_CONTROL 0x0016 /* Card specific control frames */
#define ETH_P_IRDA 0x0017 /* Linux-IrDA */
#define ETH_P_ECONET 0x0018 /* Acorn Econet */
#define ETH_P_HDLC 0x0019 /* HDLC frames */
#define ETH_P_ARCNET 0x001A /* 1A for ArcNet :-) */
#define ETH_P_DSA 0x001B /* Distributed Switch Arch. */
#define ETH_P_TRAILER 0x001C /* Trailer switch tagging */
#define ETH_P_PHONET 0x00F5 /* Nokia Phonet frames */
#define ETH_P_IEEE802154 0x00F6 /* IEEE802.15.4 frame */
#define ETH_P_CAIF 0x00F7 /* ST-Ericsson CAIF protocol */

• A tcpdump capture confirms that the product network carries two kinds of unknown packets:

14:30:02.668266 00:18:82:c7:ca:38 (oui Unknown) > Broadcast, ethertype Unknown (0x88aa), length 64:
0x0000: 0002 0d01 0000 0000 0000 0000 0000 0000 ................
0x0010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0020: 0000 0000 0000 0000 0000 0000 0000 5551 ..............UQ
0x0030: 6ed5 n.
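
To see which unknown protocol types dominate, the capture can be tallied by EtherType. The sketch below assumes tcpdump text output that includes the link-level header, in the same format as the capture above; the function name `count_unknown_ethertypes` is hypothetical.

```python
import re
from collections import Counter

# tcpdump prints "ethertype Unknown (0x....)" for types it cannot decode.
UNKNOWN_ETHERTYPE = re.compile(r"ethertype Unknown \((0x[0-9a-fA-F]{4})\)")


def count_unknown_ethertypes(lines):
    """Count frames per unknown EtherType in tcpdump text output."""
    counts = Counter()
    for line in lines:
        match = UNKNOWN_ETHERTYPE.search(line)
        if match:
            counts[int(match.group(1), 16)] += 1
    return counts


# The capture from this case; hex dump lines do not match and are ignored.
capture = [
    "14:30:02.668266 00:18:82:c7:ca:38 (oui Unknown) > Broadcast, "
    "ethertype Unknown (0x88aa), length 64:",
    "0x0000: 0002 0d01 0000 0000 0000 0000 0000 0000 ................",
    "0x0030: 6ed5 n.",
]

for ethertype, count in count_unknown_ethertypes(capture).items():
    print(hex(ethertype), count)
```

Comparing the per-type totals over a fixed interval with the growth of the dropped counter helps confirm that the unknown packets account for the drops.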

Root Cause

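In the 3.0 kernel, every supported protocol is registered in a list. When a packet is received, the kernel compares its protocol field with the entries in that list. If nothing matches, the packet is discarded and the drop counter is incremented. Kernels before 3.0 also discarded such packets, but did not count them as drops.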

Solution

Counting packets of unknown types as NIC drops is a new mechanism in the SUSE 11 SP3 kernel. It has no impact on services and can be ignored.

Suggestions and Summary

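On kernel 3.0 and later, drops reported on a NIC are not necessarily made by the NIC itself; they may also be made by the kernel. This issue may affect all server models.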

END