The PE routers are NE40Es and the P routers are NE80Es. SDH is used as the transmission network between the devices.
MPLS L3VPN is deployed between the PE routers.
MPLS TE and MPLS TE FRR are deployed between the P routers.
All NE40E/80E devices run the same software version, V300R002C06B335.
Packets larger than 1500 bytes sent from a CE device cannot pass through the MPLS L3VPN network. For example, with CE1 connected to PE1 and CE2 connected to PE2, a ping between CE1 and CE2 fails if the packet is larger than 1500 bytes, but succeeds for packets under 1500 bytes.
The following information was collected from the devices when the fault occurred:
1. The P1 interface toward P2 shows a large number of CRC and Symbol errors. (For the detailed device log, see the “Alarm information” part of this case.)
2. The CRC and Symbol error counters grow synchronously: every increment of the CRC counter is matched by one increment of the Symbol counter. (Symbol errors indicate a physical-layer problem.)
<P1>disp int gi10/0/7
GigabitEthernet10/0/7 current state : UP
Line protocol current state : UP
Description : TO_RT19182P2_GE10/0/9, Route Port
The Maximum Transmit Unit is 1500 bytes
Internet Address is 126.96.36.199/30
IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 0018-824a-ebc0
The Vendor PN is FTRJ1319P1BTL
Port BW: 1G, Transceiver max BW: 1G, Transceiver Mode: SingleMode
WaveLength: 1310nm, Transmission Distance: 10km
Loopback:none, full-duplex mode, negotiation: disable, Pause Flowcontrol:Send and Receive Enable
Statistics last cleared: never
Last 30 seconds input rate: 248176 bits/sec, 421 packets/sec
Last 30 seconds output rate: 248536 bits/sec, 421 packets/sec
Input: 65296519999 bytes, 655754284 packets
Output: 45366592111 bytes, 615856417 packets
Unicast: 655461551, Multicast: 291674
Broadcast: 1059, Jumbo: 4768201
CRC: 3467237, Symbol: 3467237
Overrun: 0 , InRangeLength: 0
LongPacket: 0 , Jabber: 0, Alignment: 0
Fragment: 0, Undersized Frame: 0
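The observation that the CRC and Symbol counters match can be checked mechanically. The following is a minimal sketch, not a vendor tool: the helper names `read_counter` and `crc_matches_symbol` are hypothetical, and the sample text is copied from the error-counter lines of the output above.

```python
import re

# Error-counter lines copied from the "disp int gi10/0/7" output above.
SAMPLE = """\
CRC: 3467237, Symbol: 3467237
Overrun: 0 , InRangeLength: 0
"""


def read_counter(output, name):
    """Return the integer value of a 'Name: value' counter in the output."""
    match = re.search(rf"\b{re.escape(name)}\s*:\s*(\d+)", output)
    if match is None:
        raise ValueError(f"counter {name!r} not found")
    return int(match.group(1))


def crc_matches_symbol(output):
    """True when the CRC and Symbol counters hold the same value."""
    return read_counter(output, "CRC") == read_counter(output, "Symbol")


if __name__ == "__main__":
    print(read_counter(SAMPLE, "CRC"), crc_matches_symbol(SAMPLE))
```

Running the same check on snapshots taken at different times would show whether the two counters keep growing together.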
The problem was solved by changing the MTU of the SDH devices to 1550 bytes. The MTU was deliberately set slightly larger than the calculated 1534 bytes rather than exactly equal to it; the extra margin does not waste bandwidth.
From the fault symptoms, an MTU mismatch is the most likely cause of the problem.
The following actions were taken to analyze the cause:
1. Check the MTU configuration of the PE and P routers.
All PE and P routers use the same MTU, 1500 bytes at the IP layer. On these routers the MTU is measured against the IP packet size, regardless of how many labels the packet carries.
2. Calculate the largest packet that can exist in this network.
The largest packet in this network is:
1500 bytes (IP packet) + 16 bytes (4 MPLS labels*) = 1516 bytes
*Four MPLS labels occur only when traffic passes through three P routers and TE FRR is in effect at the same time.
3. Check the MTU configuration of the SDH devices.
The SDH transmission devices count the Ethernet header in the MTU. Adding the 18-byte Ethernet header, the largest frame in this network is 1534 bytes from the SDH point of view. However, the customer had configured an MTU of 1524 bytes on the SDH devices, 10 bytes less than required.
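The arithmetic in steps 2 and 3 can be reproduced with a short calculation. This is only a sketch under the figures stated in this case (1500-byte IP MTU, 4 bytes per MPLS label, 18 bytes of Ethernet overhead as counted by the SDH devices); the function names are hypothetical.

```python
IP_MTU = 1500            # IP MTU configured on all PE and P routers
MPLS_LABEL_BYTES = 4     # size of one MPLS label entry
ETHERNET_OVERHEAD = 18   # Ethernet overhead counted by the SDH devices (per this case)


def sdh_frame_size(ip_bytes, labels):
    """Frame size as seen by the SDH device for a labelled IP packet."""
    return ip_bytes + labels * MPLS_LABEL_BYTES + ETHERNET_OVERHEAD


def max_labels_that_fit(sdh_mtu, ip_bytes=IP_MTU):
    """Largest label-stack depth a full-size packet can carry within sdh_mtu.

    Returns -1 if even an unlabelled packet does not fit.
    """
    spare = sdh_mtu - ip_bytes - ETHERNET_OVERHEAD
    return spare // MPLS_LABEL_BYTES if spare >= 0 else -1


if __name__ == "__main__":
    print(sdh_frame_size(IP_MTU, 4))      # worst case: 4 labels
    for mtu in (1524, 1534, 1550):
        print(mtu, max_labels_that_fit(mtu))
```

With the original SDH MTU of 1524 bytes, a full-size 1500-byte packet fits only with a single label; with two labels the frame is already 1526 bytes. An MTU of 1534 bytes covers exactly the four-label worst case, and 1550 bytes leaves room beyond it.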