In a scenario where M-LAG devices are dual-homed to a VXLAN network, services accessing the network through the M-LAG were interrupted after the customer upgraded the master SL of the M-LAG. After 15 minutes, when the master SL had finished the version upgrade and reboot, the services recovered.
In an M-LAG scenario, the two devices can be upgraded one at a time, and services accessing the M-LAG are not expected to be interrupted.
First, check the M-LAG status and confirm that the M-LAG group is normal.
On the master SL, view the ARP table for the interrupted network segment 172.31.8.96/27 and pick one host, for example, 172.31.8.100.
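As a quick sanity check that the chosen host really belongs to the interrupted segment, the following minimal sketch uses Python's standard ipaddress module. The variable names are illustrative only and are not part of the original case.

```python
import ipaddress

# Interrupted segment and sample host taken from the case above.
segment = ipaddress.ip_network("172.31.8.96/27")
host = ipaddress.ip_address("172.31.8.100")

# A /27 holds 32 addresses: 172.31.8.96 through 172.31.8.127.
print(host in segment)
print(segment.network_address, segment.broadcast_address)
```

Any address taken from the ARP table can be checked the same way before tracing its route on BL1.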
Check the routing details for this host on BL1 and confirm that the route was generated after the M-LAG master SL rebooted. This indicates that BL1 had no route to 172.31.8.96/27 while the M-LAG master SL was being upgraded.
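The timing comparison in this step can be sketched as follows. This is only an illustration: the timestamps and the route age are hypothetical values, not data from the case, and the function name route_created_after is an assumption introduced here.

```python
from datetime import datetime, timedelta

def route_created_after(route_age: timedelta, now: datetime, reboot_done: datetime) -> bool:
    """Return True if the route was installed at or after the moment the reboot finished."""
    created = now - route_age
    return created >= reboot_done

# Hypothetical values for illustration only.
now = datetime(2024, 1, 1, 12, 0, 0)          # time the route was inspected on BL1
reboot_done = datetime(2024, 1, 1, 11, 30, 0)  # time the master SL finished rebooting
route_age = timedelta(minutes=25)              # age shown for the host route

print(route_created_after(route_age, now, reboot_done))
```

A route that is younger than the time elapsed since the reboot suggests that BL1 learned it only after the master SL came back.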
Finally, check how the network segment is advertised on the M-LAG slave SL and confirm that the configuration is incorrect: the route for the network segment is advertised in the public ipv4-family and is not advertised in the corresponding VRF address family.
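The check above can be approximated offline against a saved configuration. The sketch below is a rough illustration under stated assumptions: the sample configuration text, the AS number 100, and the VRF name vpn1 are hypothetical, and the helper network_contexts is introduced here for illustration. It only looks for network statements and does not model other ways a route can be advertised.

```python
def network_contexts(config: str, prefix: str) -> list[str]:
    """Return the address-family lines under which `network <prefix> ...` appears."""
    contexts = []
    current = None
    for raw in config.splitlines():
        line = raw.strip()
        if line.startswith("ipv4-family"):
            current = line  # remember which address family we are inside
        elif line.startswith("network ") and current is not None:
            if line.split()[1] == prefix:
                contexts.append(current)
    return contexts

# Hypothetical configuration excerpt, for illustration only.
SAMPLE = """\
bgp 100
 ipv4-family unicast
  network 172.31.8.96 255.255.255.224
 ipv4-family vpn-instance vpn1
"""

found = network_contexts(SAMPLE, "172.31.8.96")
in_vrf = any("vpn-instance" in ctx for ctx in found)
print(found)
print("advertised in a VRF address family:", in_vrf)
```

With this sample, the segment shows up only under the public address family, which matches the misconfiguration described in this case.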
The route advertisement configuration for the network segment on the M-LAG slave SL is incorrect. As a result, there was no route to the destination network segment while the master SL was being upgraded and rebooted, so services were interrupted. When the master SL finished rebooting, the services recovered.
Modify the route advertisement configuration on the M-LAG slave SL so that the route for the network segment is advertised under the corresponding VRF address family in BGP.
Misconfigured route advertisement on an M-LAG SL is one possible cause of service interruption when devices in an M-LAG DFS group are upgraded.
We recommend the following version upgrade process: first, shut down the uplink and downlink interfaces of one device in the DFS group to isolate it; then confirm that the other device forwards traffic normally; finally, upgrade the isolated device.
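The ordering of the recommended process can be sketched as below. This is a minimal illustration, not a vendor tool: the three callbacks are hypothetical placeholders for whatever mechanism is used to shut down interfaces, verify forwarding, and run the upgrade, and stopping when the peer check fails is an assumption added here.

```python
from typing import Callable, List

def upgrade_one_member(
    isolate: Callable[[], None],
    peer_forwards_ok: Callable[[], bool],
    upgrade: Callable[[], None],
) -> List[str]:
    """Run the recommended sequence and return the steps that were performed."""
    done = []
    isolate()                   # shut down the uplink and downlink interfaces
    done.append("isolate")
    if not peer_forwards_ok():  # the other member must carry the traffic alone
        done.append("abort")    # assumption: do not upgrade if the check fails
        return done
    done.append("verify-peer")
    upgrade()                   # upgrade only the isolated device
    done.append("upgrade")
    return done

# Placeholder callbacks that do nothing, for illustration only.
steps_ok = upgrade_one_member(lambda: None, lambda: True, lambda: None)
steps_bad = upgrade_one_member(lambda: None, lambda: False, lambda: None)
print(steps_ok)
print(steps_bad)
```

In this case, the verification step would have exposed the missing VRF route on the slave SL before the master SL was taken out of service.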