How Do I Switch from a Stack to an M-LAG?
- How Do I Switch from a Stack to an M-LAG?
- What Is Stacking?
- What Is M-LAG?
- Why Do I Need to Switch from a Stack to an M-LAG?
- What Checks Do I Need to Perform Before Switching from a Stack to an M-LAG?
- Checking Whether MSTP and Smart Link Configurations Take Effect
- Checking Whether ACL Resources Are Sufficient
- Checking Whether Layer 2 Sub-interface Resources Are Sufficient
- Checking Whether Services Are Single-Homing Services
- Checking License Consistency
- Checking Whether the Number of Established Neighbor Relationships of Dynamic Routing Protocols Exceeds the Upper Limit
- Checking Whether IPv6 Services Are Configured
- Checking Whether Inter-Device Layer 2 Port Isolation Takes Effect
- Checking Whether Inter-Device Unidirectional Services Are Configured
- Checking Whether the System MAC Address Is the MAC Address of the Master Device in the Stack
- Checking Whether the Network Is an SDN Network
- Recording MAC Address Entry and Traffic Information
- Switching from a Stack to an M-LAG in Five Steps
How Do I Switch from a Stack to an M-LAG?
Stacking and Multichassis Link Aggregation Group (M-LAG) are two virtualization technologies widely used on data center networks to improve network reliability and scalability. M-LAG is recommended as the network access technology because of its advantages in upgrade convenience and reliability. This document describes how to switch from a stack to an M-LAG. After reading this document, you can understand the preparations for switching from a stack to an M-LAG and know how to complete the switching in five steps.
What Is Stacking?
Stacking technology virtualizes multiple stacking-capable devices into a logical device. You can manage and use these devices as a single device. Stacking allows you to expand the number of ports and the switching capacity by adding devices. In addition, the device reliability is improved because multiple member devices in a stack back up each other.
As shown in Figure 1-1, DeviceA and DeviceB are connected through stack links to establish a stack that functions as a logical device for data forwarding.
DeviceA and DeviceB back up each other. If DeviceA fails, DeviceB can take over the services of DeviceA to ensure normal operation of the system. For details about the fundamentals of stacking, see Stack Configuration in the Configuration Guide - Virtualization.
What Is M-LAG?
Multichassis Link Aggregation Group (M-LAG) is an inter-device link aggregation technology. Two access switches in the same state in an M-LAG perform link aggregation negotiation with a connected device. From the perspective of the connected device, it establishes a link aggregation relationship with a single device. M-LAG raises link reliability from the card level to the device level.
As shown in Figure 1-2, DeviceA and DeviceB establish an M-LAG, and ServerA is dual-homed to the M-LAG member devices through inter-device link aggregation.
DeviceA and DeviceB work in load balancing mode to forward traffic. If DeviceA or DeviceB fails, traffic can be rapidly switched to the other device, ensuring non-stop service transmission. For details about the fundamentals of M-LAG, see M-LAG Configuration in Configuration Guide - Ethernet Switching.
M-LAG not only solves the low reliability of traditional aggregated links, but also avoids the long upgrade time and high upgrade risk of a stack.
Why Do I Need to Switch from a Stack to an M-LAG?
As two horizontal virtualization technologies that are widely used at the access layer of data center networks, stacking and M-LAG can implement redundant terminal access and link backup, improving the reliability and scalability of data center networks. Compared with stacking, M-LAG has higher reliability and the advantage of separately upgrading each member device.
Table 1-1 compares the advantages and disadvantages of stacking and M-LAG. In scenarios that require a short service interruption time during an upgrade and high network reliability, you are advised to use M-LAG technology to replace stacking technology and use it as the terminal access technology on your data center network.
Item | Stacking | M-LAG (Recommended) |
---|---|---|
Reliability | Medium | Higher |
Configuration complexity | Simple: A stack is considered as one device logically. | Simple: Two devices are configured independently. |
Cost | Medium: Stack cables need to be deployed. | Medium: Peer-link interfaces need to be connected. |
Performance | Medium: The master device's control plane needs to control forwarding planes of all stack members, which increases the CPU load. | High: Member switches forward packets independently. The CPU load remains unchanged. |
Upgrade complexity | High complexity: Fast stack upgrade reduces the service interruption time but increases the upgrade operation time and upgrade risk. | Low complexity: Two member devices can be upgraded separately. The upgrade operation is simple and the risk is low. |
Service interruption time during an upgrade | Relatively long: In typical networking, the service interruption time during a fast stack upgrade is about 20 seconds to 1 minute, which is closely related to the service volume. | Short: Traffic is interrupted in seconds. |
Network design | Relatively simple: A stack is considered as one device logically, and the network structure is simple. | Relatively complex: M-LAG member devices are still two devices logically, and the network structure is complex. |
Application scenario | | |
What Checks Do I Need to Perform Before Switching from a Stack to an M-LAG?
Checking Whether MSTP and Smart Link Configurations Take Effect
Check whether MSTP or Smart Link is configured. Run the display stp global command and check the Protocol Status and Mode fields in the command output to determine whether MSTP is enabled and takes effect. Run the display smart-link group command. If the Smart Link group field displays enabled, Smart Link is enabled on the device. Because an M-LAG does not support MSTP or Smart Link, do not switch from a stack to an M-LAG if either of them is enabled.
<HUAWEI> display stp global
Protocol Status            :Enabled     //STP is enabled.
Bpdu-filter default        :Disabled
Tc-protection              :Enabled
Tc-protection threshold    :1
Tc-protection interval     :2s
Edged port default         :Disabled
Pathcost-standard          :Dot1T
Timer-factor               :3
Transmit-limit             :6
Bridge-diameter            :7
CIST Global Information:
  Mode                :MSTP     //The STP mode is MSTP.
  CIST Bridge         :32768.0019-7459-3301
  Config Times        :Hello 2s MaxAge 20s FwDly 15s MaxHop 20
  Active Times        :Hello 2s MaxAge 20s FwDly 15s MaxHop 20
  CIST Root/ERPC      :32768.0019-7459-3301 / 0 (This bridge is the root)
  CIST RegRoot/IRPC   :32768.0019-7459-3301 / 0 (This bridge is the root)
  CIST RootPortId     :0.0
  BPDU-Protection     :Disabled
  TC or TCN received  :9
  TC count per hello  :0
  STP Converge Mode   :Normal
  Share region-configuration :Enabled
  Time since last TC  :0 days 1h:37m:17s
  Number of TC        :10
  Last TC occurred    :10GE4/0/12
  Topo Change Flag    :0
# Display the status of a Smart Link group.
<HUAWEI> display smart-link group 1
Smart Link group 1 information :
  Smart Link group: enabled     //The Smart Link group is enabled.
  Link status: Lock
  Wtr-time is: 60 sec.
  Load-Balance Instance: 10
  Protected-VLAN reference-instance: --
  DeviceID: 0025-9e80-2494
  Control-VLAN ID: 505
  Member      Role    InstanceID  State     Flush Count  LastFlushTime
  ---------------------------------------------------------------------------------
  10GE1/0/1   Master  0           Active    0            0000/00/00 00:00:00 UTC+00:00     //The member interface is in the forwarding state.
  10GE1/0/1   Master  10          Inactive  0            0000/00/00 00:00:00 UTC+00:00
  10GE1/0/2   Slave   0           Inactive  0            0000/00/00 00:00:00 UTC+00:00
  10GE1/0/2   Slave   10          Active    0            0000/00/00 00:00:00 UTC+00:00
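The two checks above can also be scripted against saved command output. The following is only a minimal sketch: the function names are hypothetical, and it assumes the output text has the same field labels as the examples above (Protocol Status, Mode, and Smart Link group).

```python
import re


def mstp_in_effect(stp_output: str) -> bool:
    """Return True if the output reports STP as enabled and the mode as MSTP."""
    enabled = re.search(r"Protocol Status\s*:\s*Enabled\b", stp_output, re.IGNORECASE)
    # "STP Converge Mode" also ends in "Mode", so collect every Mode-like
    # field and look for MSTP among the values.
    modes = [m.upper() for m in re.findall(r"\bMode\s*:\s*(\w+)", stp_output)]
    return enabled is not None and "MSTP" in modes


def smart_link_enabled(smart_link_output: str) -> bool:
    """Return True if a Smart Link group is reported as enabled."""
    pattern = r"Smart Link group\s*:\s*enabled\b"
    return re.search(pattern, smart_link_output, re.IGNORECASE) is not None


def blocks_switch(stp_output: str, smart_link_output: str) -> bool:
    """Either feature being active means the stack should not be switched yet."""
    return mstp_in_effect(stp_output) or smart_link_enabled(smart_link_output)
```

A result of True from blocks_switch means the stack should not be switched to an M-LAG until the feature in question has been dealt with.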
Checking Whether ACL Resources Are Sufficient
Check whether ACL resources on the device are sufficient. M-LAG occupies the following ACL resources:
- M-LAG occupies the group used by the CPCAR L2 service in the ingress direction, and the M-LAG ARP and M-LAG Protocol services are delivered.
- M-LAG occupies 160 bits of a group in the egress direction (320 bits in total in the egress direction of a single chip), including the resources for the delivered M-LAG IPv4 UC, M-LAG IPv6 UC, and M-LAG Isolate services.
The device provides a small number of ACL resources in the egress direction. Therefore, you need to pay special attention to these ACL resources.
The following uses the CE12800 as an example.
- Run the display system tcam service brief command to check the ID of groups and number of rules used by different services.
<HUAWEI> display system tcam service brief slot 1
Slot: 1
------------------------------------------------------------------------------
Chip  GroupID   Width   Stage    ServiceName           Count
      (FEI/FE)
------------------------------------------------------------------------------
0     2/2       320Bit  Ingress  BPDU Deny             21
      2/2       320Bit  Ingress  CPCAR L2              4
      2/2       320Bit  Ingress  L2 Protocol Tunnel    1
      3/3       320Bit  Ingress  App-Session           2
      3/3       320Bit  Ingress  CPCAR L3              19
------------------------------------------------------------------------------
There are 12 groups on the device. The GroupID field in the preceding command output shows that two groups are occupied and the remaining 10 groups are sufficient for the M-LAG ARP and M-LAG Protocol services.
- Run the display system tcam bank resource command to check the resource usage of each service.
<HUAWEI> display system tcam bank resource slot 1 Slot: 1 Chip: 0 ----------------------------------------------------------Bank Usage-------------------------------------------------------- BankId Entry Entry Entry Stage GroupId KBType KBId ServiceName Size Free Used (FEI/FE) ---------------------------------------------------------------------------------------------------------------------------- 0,1 320Bit 957 21 Ingress 2/2 L2 2,3 BPDU Deny 6 2/2 L2 2,3 CPCAR L2 1 2/2 L2 2,3 L2 Protocol Tunnel 4 3/3 IPv4 2,3 App-Session 31 3/3 IPv4 2,3 CPCAR L3 2 3/3 IPv4 2,3 VXLAN DFS 2,3 320Bit 1003 5 Ingress 4/8 IPv6 4,5 CPCAR Ipv6 2 6/1 L2 4,5 CPCAR Dci 6/1 IPv4 4,5 12 6/1 L2 4,5 EVN Packet 6/1 IPv4 4,5 1 6/1 L2 4,5 Ping Packet Pass 6/1 IPv4 4,5 4 160Bit 1020 3 Ingress 5/6 IPv4 1 CPCAR Terminated v4 5/6 L2 1 5/6 MPLS 1 5/6 IPv6 1 5 160Bit 1011 12 Ingress 8/7 MPLS 4 MPLS PHP 1 178/9 IPv4 7 CPCAR Vxlan Ipv6 6 - - -- 7 - - -- 8 - - -- 9 - - -- 10 - - -- 11 - - -- 12 - - -- 13 - - -- ---------------------------------------------------------------------------------------------------------------------------- ----------------------------------------------------------KB Usage---------------------------------------------------------- KBType Total Used Free ---------------------------------------------------------------------------------------------------------------------------- MPLS 8 2(1,4) 6(0,2,3,5,6,7) L2 8 5(1,2,3,4,5) 3(0,6,7) IPv6 8 3(1,4,5) 5(0,2,3,6,7) IPv4 8 7(1,2,3,4,5,6,7) 1(0) ----------------------------------------------------------------------------------------------------------------------------
The Stage field in the preceding command output shows that no egress resource is occupied, indicating that there are sufficient resources for M-LAG in the egress direction. The commands in steps 3 to 9 are used to determine the services that occupy a large number of ACL resources when the command outputs in steps 1 and 2 show that resources are insufficient. You can then delete the services to release resources.
- Run the display system tcam resource command to check resource information about the external TCAM.
[~HUAWEI] display system tcam resource slot 3
Resource Detail Information:
------------------------------------------------------------------------------------
Slot  Chip  TCAM      Service        Banks  Total  Used  Free
------------------------------------------------------------------------------------
3     0     internal  All            12     24576  566   24010
3     0     internal  |- ACL         --     --     560   --
3     0     internal  |- UCv6Route   --     --     6     --
3     0     internal  |- MCv4Route   --     --     0     --
3     0     internal  |- MCv6Route   --     --     0     --
3     1     internal  All            12     24576  566   24010
3     1     internal  |- ACL         --     --     560   --
3     1     internal  |- UCv6Route   --     --     6     --
3     1     internal  |- MCv4Route   --     --     0     --
3     1     internal  |- MCv6Route   --     --     0     --
3     2     internal  All            12     24576  566   24010
3     2     internal  |- ACL         --     --     560   --
3     2     internal  |- UCv6Route   --     --     6     --
3     2     internal  |- MCv4Route   --     --     0     --
3     2     internal  |- MCv6Route   --     --     0     --
3     3     internal  All            12     24576  566   24010
3     3     internal  |- ACL         --     --     560   --
3     3     internal  |- UCv6Route   --     --     6     --
3     3     internal  |- MCv4Route   --     --     0     --
3     3     internal  |- MCv6Route   --     --     0     --
------------------------------------------------------------------------------------
Resource Template Information:
-------------------------------------------------------------------------
Slot  Type         RunningTemplate  NextTemplate
-------------------------------------------------------------------------
3     CE-L24LQ-EA  --               --
-------------------------------------------------------------------------
- Run the display system tcam resource acl command to check resource information about the TCAM.
<HUAWEI> display system tcam resource acl slot 1
--------------------------------------------------------------------------------
Slot  Chip  TCAM      Resource  Stage           Total  Used  Limited  Free
--------------------------------------------------------------------------------
1     0     Internal  Banks     Ingress+Egress  12     2     2        10
1     0     Internal  Rules     Ingress+Egress  24576  128   3968     20480
1     0     Internal  Meters    Ingress+Egress  65536  0     0        65536
1     0     Internal  Counters  Ingress         16384  0     0        16384
1     0     Internal  Counters  Egress          2816   0     0        2816
1     1     Internal  Banks     Ingress+Egress  12     2     2        10
1     1     Internal  Rules     Ingress+Egress  24576  128   3968     20480
1     1     Internal  Meters    Ingress+Egress  65536  0     0        65536
1     1     Internal  Counters  Ingress         16384  0     0        16384
1     1     Internal  Counters  Egress          2816   0     0        2816
--------------------------------------------------------------------------------
- Run the display system tcam acl group resource command to check the resource usage of each service.
<HUAWEI> display system tcam acl group resource slot 1
STG : Stage                       KCP  : Key Construction Program
ING : Ingress                     EGR  : Egress
CYC : Cycle                       PTYPE: PortType
FRT : Front Ports                 RCY  : Recycle Ports
16-L: 16bit-LSB Copy Engine       16-M : 16bit-MSB Copy Engine
32-L: 32bit-LSB Copy Engine       32-M : 32bit-MSB Copy Engine
F   : Free                        T    : Total
--------------------------------------------------------------------------------
Slot: 1    Chip : 0    UseRate:Normal
--------------------------------------------------------------------------------
STG  KCP  PacketType  PTYPE  CYC  Group  UsedKey  16-L  32-L  16-M  32-M
                                                  F|T   F|T   F|T   F|T
--------------------------------------------------------------------------------
ING  1    L2          FRT    0    2      2,3      2|8   5|8   7|8   6|8
ING  2    IPV4        FRT    0    3      2,3      0|8   4|8   4|8   6|8
ING  3    TRILL       FRT    0    1      2,3      6|8   5|8   7|8   6|8
ING  4    IPV6        FRT    0    4      2,3      2|8   5|8   1|8   6|8
--------------------------------------------------------------------------------
- Run the display system tcam fail-record command to check service delivery failure records of the TCAM.
<HUAWEI> display system tcam fail-record
-----------------------------------------------------------------------------------
Slot  Chip  Time                 Service              ErrInfo
-----------------------------------------------------------------------------------
1     1     2019-03-24 06:40:11  Traffic Policy VLAN  Group resource full
-----------------------------------------------------------------------------------
Total: 1
- Run the display system forwarding resource command to check the usage of key chip resources.
[~HUAWEI] display system forwarding resource slot 4
Local Common Hardware Forwarding Tables:
-------------------------------------------------------------------------------
Slot  Chip  Name        Total   Remain  Used[  %]
-------------------------------------------------------------------------------
4     0     LEM         262144  262142  2[  0%]
4     0     - MAC                       1[  0%]
4     0     - IP host                   0[  0%]
4     0     - ILM                       1[  0%]
4     1     LEM         262144  262136  8[  0%]
4     1     - MAC                       7[  0%]
4     1     - IP host                   0[  0%]
4     1     - ILM                       1[  0%]
-------------------------------------------------------------------------------
4     0     LPM         32768   32768   0[  0%]
4     0     - IPv4 UC                   0[  0%]
4     0     - IPv4 MC                   0[  0%]
4     0     - IPv6 UC                   0[  0%]
4     0     - IPv6 MC                   0[  0%]
4     1     LPM         32768   32768   0[  0%]
4     1     - IPv4 UC                   0[  0%]
4     1     - IPv4 MC                   0[  0%]
4     1     - IPv6 UC                   0[  0%]
4     1     - IPv6 MC                   0[  0%]
-------------------------------------------------------------------------------
......
- Run the display traffic-policy applied-record command to check traffic policy application records.
<HUAWEI> display traffic-policy applied-record
Total records : 4
--------------------------------------------------------------------------------
Policy Type/Name  Apply Parameter  Slot  State
--------------------------------------------------------------------------------
dsc               Global(IN)       1     fail(3)
                                   2     fail(3)
                                   4     fail(3)
n4                10GE4/0/2(IN)    4     fail(4)
p1                10GE4/0/5(IN)    4     fail(4)
--------------------------------------------------------------------------------
Fail reason:
3 -- The numbers of matched conditions and actions in the traffic policy exceed the limit.
4 -- Insufficient ACL resources.
- Run the display system tcam acl resource key-buffer command to check the KB resource usage of current ACL resources.
<HUAWEI> display system tcam acl resource key-buffer verbose
KB : Key Buffer
Slot: 1
--------------------------------------------------------------------------------
Chip  Direction  ServiceName          Group  KBType  UsedKBID
--------------------------------------------------------------------------------
0     Ingress    BPDU Deny            2      L2      2,3
      Ingress    CPCAR                2      L2      2,3
      Ingress    L2 Protocol Tunnel   2      L2      2,3
      Ingress    App-Session          3      IPv4    2,3
      Ingress    CPCAR                3      IPv4    2,3
      Ingress    ECMP Hash            105    IPv4    4
      Ingress    LAG Hash             119    IPv4    4
      Ingress    Traffic Policy VLAN  294    IPv4    5
--------------------------------------------------------------------------------
The preceding commands, shown for the CE12800, apply to modular switches. For details about each command, see the CloudEngine series switches product documentation.
The commands used on fixed switches are the display system tcam resource acl, display system tcam fail-record, display system tcam bank resource, display system tcam service brief, and display traffic-policy applied-record commands.
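The free-group count read from the display system tcam service brief output can be derived programmatically. The sketch below is an illustration only: the function names are hypothetical, the total number of groups must be supplied by the caller (12 in the CE12800 example above), and a GroupID is assumed to be the n/n token that directly precedes the width column.

```python
import re


def occupied_groups(service_brief_output: str) -> set:
    """Collect the distinct GroupID values (for example "2/2") in the output."""
    # Assumption: each service row contains "<GroupID> <width>Bit".
    return set(re.findall(r"\b(\d+/\d+)\s+\d+Bit\b", service_brief_output))


def free_groups(service_brief_output: str, total_groups: int) -> int:
    """Return how many groups remain after subtracting the occupied ones."""
    return total_groups - len(occupied_groups(service_brief_output))
```

With the sample output shown earlier, two distinct GroupID values (2/2 and 3/3) are found, which matches the two occupied groups described in the text.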
Checking Whether Layer 2 Sub-interface Resources Are Sufficient
In a scenario where an M-LAG is connected to a VXLAN network, after a VNI is bound to a bridge domain (BD) on an M-LAG member device, an implicit Layer 2 sub-interface is created on a peer-link interface. The number of implicit Layer 2 sub-interfaces is the same as the number of VNIs bound to BDs. Therefore, before switching from a stack to an M-LAG, check whether the total number of VNI-bound BDs and Layer 2 sub-interfaces on the device exceeds the maximum number of Layer 2 sub-interfaces supported by the device. For details about specifications, visit https://support.huawei.com/onlinetoolweb/sqt/index to use the specifications query tool.
If the total number exceeds the maximum number, check whether the existing Layer 2 sub-interfaces can be deleted.
- If the Layer 2 sub-interfaces can be deleted, delete them to release resources so that the total number of VNI-bound BDs and Layer 2 sub-interfaces does not exceed the maximum number of Layer 2 sub-interfaces supported by the device.
- If the Layer 2 sub-interfaces cannot be deleted, do not switch from the stack to an M-LAG.
# Display the number of BDs to which VNIs are bound on the device.
<HUAWEI> display bridge-domain binding-info
--------------------------------------------------------------------------------
BDID  VNI  VSI  EVPN
--------------------------------------------------------------------------------
1     1
2     2
3     3
4     4
5     5
6     6
7     7
8     8
9     9
10    10
11    11
12    12
--------------------------------------------------------------------------------
# Display the number of Layer 2 sub-interfaces on the device.
<HUAWEI> display current-configuration | in mode l2
interface Eth-Trunk10.1 mode l2
interface Eth-Trunk10.2 mode l2
interface Eth-Trunk10.3 mode l2
interface Eth-Trunk12.1 mode l2
interface Eth-Trunk200.1 mode l2
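The comparison described in this section amounts to adding two counts and checking the sum against the device specification. The following sketch assumes the two command outputs have been saved as multi-line text in the layout shown above. The function names are hypothetical, and max_l2_subinterfaces is a value you look up yourself with the specifications query tool.

```python
import re


def count_vni_bound_bds(binding_info: str) -> int:
    """Count rows that start with a BD ID followed by a VNI."""
    return sum(
        1
        for line in binding_info.splitlines()
        if re.match(r"\s*\d+\s+\d+(\s|$)", line)
    )


def count_l2_subinterfaces(config: str) -> int:
    """Count 'interface <name>.<id> mode l2' lines, ignoring the command echo."""
    return sum(
        1
        for line in config.splitlines()
        if re.match(r"\s*interface\s+\S+\.\d+\s+mode\s+l2\s*$", line)
    )


def fits_after_switch(binding_info: str, config: str, max_l2_subinterfaces: int) -> bool:
    """True if VNI-bound BDs plus existing Layer 2 sub-interfaces stay within the limit."""
    total = count_vni_bound_bds(binding_info) + count_l2_subinterfaces(config)
    return total <= max_l2_subinterfaces
```

For the sample outputs above, the counts are 12 VNI-bound BDs and 5 Layer 2 sub-interfaces, so the total to compare against the specification is 17.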
Checking Whether Services Are Single-Homing Services
As shown in Figure 1-3, when a stack is working properly, services on the network are classified into dual-homing and single-homing services. Dual-homing service traffic is forwarded by two stack members, and single-homing service traffic is forwarded by one stack member. The two types of services are processed as follows when the stack is switched to an M-LAG:
- Dual-homing services: If the bandwidth of a member device is sufficient for transmitting services on the entire network, the stack can be switched to an M-LAG.
- Single-homing services: If single-homing services cannot be interrupted, switch the single-homing services to the backup device to ensure that the services are running properly, and then switch from the stack to an M-LAG.
If single-homing services can be interrupted, you can interrupt the services before the switching. After the stack is successfully switched to an M-LAG, restore the services on the M-LAG.
Checking License Consistency
In a stack of modular switches, run the display license [ verbose ] chassis chassis-number command to check the license of each stack member.
In a stack of fixed switches, run the display license [ verbose ] slot slot-id command to check the license of each stack member. Ensure that all stack members have a license installed.
- If a member device does not have a license, activate the license file to prevent service delivery failures after the stack splits.
- Upload the license file to the device.
For details on how to apply for and upload a license, see the License Usage Guide.
- Activate the license file to obtain authorization for functions.
license active filename
- If none of the stack members has a license, ignore this check item.
<HUAWEI> display license slot 0
slot 0:
Active License    : flash:/CloudEngine7800.dat
License state     : Normal
Revoke ticket     : No ticket

RD of Huawei Technologies Co., Ltd.

Product name      : CloudEngine 7800
Product version   : V100R006
License Serial No : LIC201411261KSH50
Creator           : Huawei Technologies Co., Ltd.
Created Time      : 2014-11-26 09:09:51
Feature name      : CELIC
Authorize type    : demo
Expired date      : 2015-02-20
Trial days        : -

Item name     Item type  Value  Description
-------------------------------------------------------------
CE-LIC-VXLAN  Function   YES    CE-LIC-VXLAN

License state: Demo. The license for the current configuration will expire in 86 day(s). Apply for authentic license before the current license expires.
<HUAWEI> display license chassis 1/2
Active License    : flash:/LICCloudEngine12800_V200R019_20190725UMKG5T.xml
License state     : Normal
Revoke ticket     : No ticket

RD of Huawei Technologies Co., Ltd.

Product name      : CloudEngine 12800
Product version   : V200R019
License Serial No : LIC20190725UMKG5T
Creator           : Huawei Technologies Co., Ltd.
Created Time      : 2019-07-25 14:36:20
Feature name      : CELIC
Authorize type    : comm
Expired date      : PERMANENT
Trial days        : 60

Item name       Item type  Value  Description
-------------------------------------------------------------
CE128-LIC-IPV6  --         1      CloudEngine 12800 IPv6 Function
CE128-LIC-IPv6  Function   YES    CE128-LIC-IPv6
CE128-LIC-TLM   --         1      CE12800 Telemetry Function
CE-LIC-TLM      Function   YES    CE-LIC-TLM
CE128-LIC-VS    --         1      CloudEngine 12800 Virtual System Function
DE0S0000VS01    Function   YES    CE128-LIC-VS

Master board license state: Trial. The trial days remains 60 day(s). Apply for authentic license before the current license expires.
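The decision in this check item has three outcomes, depending on which members have a license. The sketch below is illustrative: the function name and the dictionary layout (member identifier mapped to whether an active license was seen in the display license output) are assumptions, not part of the product.

```python
def members_needing_license(has_license: dict) -> list:
    """Return the stack members on which a license still has to be activated.

    has_license maps a member identifier to True or False.
    An empty result means no action is required for this check item.
    """
    missing = sorted(member for member, ok in has_license.items() if not ok)
    if len(missing) == len(has_license):
        # No member has a license at all: the check item is ignored.
        return []
    # Either every member is licensed (empty list) or only some are
    # (the unlicensed ones need the license file uploaded and activated).
    return missing
```

For example, a stack in which member 1 is licensed and member 2 is not yields member 2 as the device on which to upload and activate the license file before the switch.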
Checking Whether the Number of Established Neighbor Relationships of Dynamic Routing Protocols Exceeds the Upper Limit
Check whether a dynamic routing protocol is configured in the stack. If so, check whether the number of established neighbor relationships of the dynamic routing protocol exceeds the upper limit.
- In V200R019C10 and earlier versions, a device connected to an M-LAG through M-LAG member interfaces cannot communicate with the M-LAG through a dynamic routing protocol. Therefore, you need to check whether dynamic routing protocols are configured on Eth-Trunk interfaces whose member interfaces are on different stack members.
- Run commands to query dynamic routing protocol information. Dynamic routing protocols include RIP, OSPF, BGP, and IS-IS.
# Run the display rip neighbor command to view RIP neighbor information. Number of RIP routes indicates the number of RIP neighbors.
<HUAWEI> display rip 1 neighbor
----------------------------------------------------------------
IP Address  Interface  Type  Last-Heard-Time
----------------------------------------------------------------
10.1.1.1    Vlanif100  RIP   0:0:7
Number of RIP routes:1     //Number of RIP neighbors
# Run the display ospf peer command to view information about neighbors in each OSPF area. Total number of peer(s) indicates the number of OSPF neighbors.
<HUAWEI> display ospf 1 peer brief
OSPF Process 1 with Router ID 10.10.10.1
Peer Statistic Information
Total number of peer(s): 1     //Number of OSPF neighbors
Peer(s) in full state: 1
----------------------------------------------------------------------------
Area Id  Interface  Neighbor id  State
0.0.0.0  Vlanif10   10.10.10.3   Full
# Run the display ospfv3 peer command to view OSPFv3 neighbor information. Total number of peer(s) indicates the number of OSPFv3 neighbors.
<HUAWEI> display ospfv3 1 peer vlanif 10
OSPFv3 Process (1)
Total number of peer(s): 1     //Number of OSPFv3 neighbors
Peer(s) in full state: 1
OSPFv3 Area (0.0.0.0)
Neighbor ID  Pri  State    Dead Time  Interface  Instance ID
10.1.1.1     1    Full/ -  00:00:30   Vlanif10   0
# Run the display isis peer command to view IS-IS neighbor information. Total Peer(s) indicates the number of IS-IS neighbors.
<HUAWEI> display isis peer
Peer Information for ISIS(1)
--------------------------------------------------------------------------------
System ID       Interface  Circuit ID         State  HoldTime(s)  Type      PRI
--------------------------------------------------------------------------------
0000.0000.0002  10GE1/0/9  0000.0000.0001.01  Up     23           L1(L1L2)  64
0000.0000.0003  10GE1/0/9  0000.0000.0001.01  Up     27           L1        64
0000.0000.0002  10GE1/0/9  0000.0000.0001.01  Up     23           L2(L1L2)  64
0000.0000.0004  10GE1/0/9  0000.0000.0001.01  Up     23           L2        64
Total Peer(s): 4     //Number of IS-IS neighbors
# Run the display bgp peer command to view BGP peer information. Total number of peers indicates the number of BGP peers.
<HUAWEI> display bgp peer
Status codes: * - Dynamic
BGP local router ID : 10.2.3.4
Local AS number : 10
Total number of peers : 2     //Number of BGP peers
Peers in established state : 1
Total number of dynamic peers : 0
Peer      V  AS   MsgRcvd  MsgSent  OutQ  Up/Down   State        PrefRcv
10.1.1.1  4  100  0        0        0     00:00:07  Idle         0
10.2.5.6  4  200  32       35       0     00:17:49  Established  0
- If any of the preceding dynamic routing protocols is configured, its configuration cannot be inherited directly by the M-LAG. Change it to a static route configuration (using the ip route-static or ipv6 route-static command). For details on how to configure static routes, see Static Route Configuration in the Configuration Guide - IP Unicast Routing.
- As shown in Figure 1-4, assume that the stack has established an OSPF neighbor relationship with the upstream device, and the number of OSPF neighbor relationships on the upstream device has reached the upper limit. Switching from a stack to an M-LAG is equivalent to replacing one device with two devices. If each M-LAG member device establishes its own OSPF neighbor relationship with the upstream device after the switch, the number of OSPF neighbor relationships on the upstream device exceeds the upper limit. To prevent neighbor relationship establishment from failing for this reason, establish an OSPF neighbor relationship between only one M-LAG member device and the upstream device, and set up a Layer 3 direct link between the M-LAG member devices so that they establish an OSPF neighbor relationship with each other. The bandwidth of the new link must meet the capacity requirement.
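The neighbor-count arithmetic in this scenario can be written out as a small check. This is a sketch under the assumptions stated in the text: the stack currently counts as exactly one neighbor of the upstream device, and the function and parameter names are made up for illustration.

```python
def upstream_neighbors_after_switch(current_neighbors: int, peer_with_both_members: bool) -> int:
    """Neighbor count on the upstream device once the stack becomes an M-LAG.

    current_neighbors includes the single relationship with the stack.
    """
    without_stack = current_neighbors - 1
    # Each M-LAG member that peers with the upstream device counts separately.
    return without_stack + (2 if peer_with_both_members else 1)


def exceeds_limit(current_neighbors: int, limit: int, peer_with_both_members: bool) -> bool:
    """True if the planned design pushes the upstream device over its limit."""
    return upstream_neighbors_after_switch(current_neighbors, peer_with_both_members) > limit
```

If the upstream device is already at its limit, peering with both members exceeds the limit by one, whereas peering with a single member keeps the count unchanged.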
Checking Whether IPv6 Services Are Configured
Check the version of member devices in the stack and whether IPv6 services are configured on the devices. If the devices run a version earlier than V200R005C10 and are configured with IPv6 services such as DHCPv6 services, upgrade the system software of the devices to V200R005C10 or a later version, and then switch from the stack to an M-LAG. Otherwise, IPv6 services do not take effect in the M-LAG. (For details on how to upgrade a stack, see Upgrading Stack Software in the Configuration Guide - Virtualization.)
# Display the version of a device.
<HUAWEI> display version
Huawei Versatile Routing Platform Software
VRP (R) software, Version 8.200 (CloudEngine 6800 V200R005C10)
Copyright (C) 2012-2020 Huawei Technologies Co., Ltd.
......
<HUAWEI> display dhcpv6 pool pool1
DHCPv6 pool: pool1
  Address prefix: FC00:2::/64
    Lifetime 172800 seconds, preferred 86400 seconds
    100 in use, 0 conflicts     //The DHCPv6 server service is configured.
  Information refresh time: 86400
  DNS server address: FC00:2::3
  DNS server domain name: huiwei.com
  Conflict-address expire-time: 172800
  renew-time-percent : 50
  rebind-time-percent : 80
  Active normal clients: 0
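The version comparison in this check can be automated by extracting the VxxxRxxxCxx string from the display version output. The sketch below assumes that versions order by their V, R, and C numbers in that sequence and ignores any patch suffix. The function names are hypothetical.

```python
import re

# Minimum version named in this section for IPv6 services in an M-LAG.
MIN_VERSION = (200, 5, 10)  # V200R005C10


def parse_version(text: str) -> tuple:
    """Extract (V, R, C) numbers from text such as '... V200R005C10)'."""
    match = re.search(r"V(\d+)R(\d+)C(\d+)", text)
    if match is None:
        raise ValueError("no VxxxRxxxCxx version string found")
    return tuple(int(part) for part in match.groups())


def needs_upgrade(version_output: str, ipv6_configured: bool) -> bool:
    """True if IPv6 services are configured and the version is below the minimum."""
    return ipv6_configured and parse_version(version_output) < MIN_VERSION
```

The ipv6_configured flag is left to the caller, because this section only shows DHCPv6 as one example of an IPv6 service.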
Checking Whether Inter-Device Layer 2 Port Isolation Takes Effect
Run the display port-isolate group command to check the configuration of an inter-device Layer 2 port isolation group in the stack.
<HUAWEI> display port-isolate group all
The ports in isolate group 1:
10GE1/0/1     10GE2/0/2
Layer 2 port isolation disables Layer 2 ports in the same VLAN from communicating with each other. As shown in Figure 1-5, PC1, PC2, and PC3 belong to VLAN 10. After 10GE1/0/1 and 10GE2/0/2 connected to PC1 and PC2 respectively are added to a port isolation group, PC1 and PC2 in VLAN 10 cannot communicate with each other.
However, Layer 2 ports on M-LAG master and backup devices are not isolated by default. You can use ACLs or VLANs to isolate the ports.
- ACL-based isolation: An ACL can be configured to prevent traffic from PC1 from being forwarded to PC2. In this case, ACL resources are occupied.
- VLAN-based isolation: Ports connected to PC1 and PC2 can be added to different VLANs for isolation. However, when the networking environment is complex and a large number of ports need to be isolated, the cost of VLAN-based isolation increases.
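To find the isolation groups that need ACL-based or VLAN-based replacement, it is enough to identify groups whose ports sit on more than one stack member. The following sketch makes two assumptions that are not stated in the source: the output is available as multi-line text in the layout shown above, and the first number of a member/subcard/port interface number is the stack member ID. The function names are hypothetical.

```python
import re


def stack_member(interface: str) -> int:
    """Return the first field of the interface number, e.g. 2 for '10GE2/0/2'."""
    match = re.search(r"(\d+)/\d+/\d+$", interface)
    if match is None:
        raise ValueError(f"unexpected interface name: {interface}")
    return int(match.group(1))


def cross_member_groups(port_isolate_output: str) -> list:
    """Return IDs of isolation groups whose ports span several stack members."""
    members_by_group = {}
    current = None
    for line in port_isolate_output.splitlines():
        header = re.search(r"isolate group (\d+)", line)
        if header:
            current = int(header.group(1))
            members_by_group[current] = set()
            continue
        if current is None:
            continue
        for token in line.split():
            if re.fullmatch(r"\S*\d+/\d+/\d+", token):
                members_by_group[current].add(stack_member(token))
    return sorted(g for g, members in members_by_group.items() if len(members) > 1)
```

Groups returned by cross_member_groups are the ones whose isolation does not carry over to the M-LAG by default.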
Checking Whether Inter-Device Unidirectional Services Are Configured
Inter-device unidirectional services include inter-device redirection and inter-device mirroring. If these two types of services are configured, you need to change the traffic model so that traffic is sent out from one device.
- Check whether inter-device redirection services are configured.
- Check the MQC-based inter-device redirection configuration.
# Display the traffic policy application record.
[~SwitchA] display traffic-policy applied-record
Total records : 1
-------------------------------------------------------------------------------
Policy Type/Name  Apply Parameter  Slot  State
-------------------------------------------------------------------------------
p1                10GE1/0/1(IN)    1     success
-------------------------------------------------------------------------------
# Display the traffic policy configuration.
<SwitchA> display traffic policy
Traffic Policy Information:
  Policy: p1
    Classifier: c1
      Type: OR
    Behavior: b1
      Redirect:
        Redirect interface 10GE2/0/3
Total policy number is 1
If the Apply Parameter and Redirect interface fields display interfaces on different stack member devices, inter-device redirection is configured.
- Check the ACL-based inter-device redirection configuration.
# Display the traffic policy application record.
[~SwitchA] display traffic-policy applied-record
Total records : 1
-------------------------------------------------------------------------------
Policy Type/Name                 Apply Parameter       Slot  State
-------------------------------------------------------------------------------
traffic-redirect                 10GE1/0/1 inbound     1     success
-------------------------------------------------------------------------------
# Display the traffic-redirect configuration on the interface.
[~SwitchA] interface 10ge 1/0/1
[~SwitchA-10GE1/0/1] display this | in traffic-redirect
[~SwitchA-10GE1/0/1] traffic-redirect acl 4001 interface 10ge 2/0/3 inbound
If the Apply Parameter and traffic-redirect fields display interfaces on different stack member devices, inter-device redirection is configured.
- Check whether inter-device mirroring services are configured.
- Check the observing port configured on the device.
# Display the observing port configuration.
<HUAWEI> display observe-port
-----------------------------------------------------------------------------
Index    : 1
Slot     : 1
Interface: 100GE1/0/3
-----------------------------------------------------------------------------
- Check the mirroring configuration on the device.
# Display the mirroring configuration.
<HUAWEI> display port-mirroring
Observe port mirroring:
---------------------------------------------------------------
MirroringPort        Direction   ObservePort
---------------------------------------------------------------
100GE4/0/10          Inbound     1
---------------------------------------------------------------
Traffic mirroring:
---------------------------------------------------------------
TrafficBehavior      ObservePort
---------------------------------------------------------------
b                    1
When the observing port and mirrored port are on different stack member devices, inter-device mirroring is configured.
Inter-device mirroring is used as an example. As shown in Figure 1-6, the observing port and a mirrored port are located on different devices. After the stack is switched to an M-LAG, traffic on one device cannot be mirrored to the observing port on the other device. To prevent mirrored traffic from being lost, run the observe-port command to configure an observing port on the device where the mirrored port resides.
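The cross-device comparison used in the redirection and mirroring checks above can be scripted. The following Python sketch is illustrative only. It assumes that the first number in an interface name such as 10GE1/0/1 is the stack member ID, as in the sample outputs, and the helper names member_id and spans_members are hypothetical, not part of any Huawei tool.

```python
import re


def member_id(interface: str) -> int:
    """Return the first number of the member/subcard/port triple.

    Works on strings such as "10GE1/0/1", "10GE1/0/1(IN)" or "10ge 2/0/3".
    Assumption: that first number identifies the stack member device.
    """
    match = re.search(r"(\d+)/\d+/\d+", interface)
    if match is None:
        raise ValueError(f"no member/subcard/port triple in {interface!r}")
    return int(match.group(1))


def spans_members(source: str, destination: str) -> bool:
    """True if the two interfaces sit on different stack member devices."""
    return member_id(source) != member_id(destination)


# Values taken from the sample outputs in this section.
print(spans_members("10GE1/0/1(IN)", "10GE2/0/3"))   # redirection example
print(spans_members("100GE4/0/10", "100GE1/0/3"))    # mirroring example
```

A result of True for a pair of interfaces indicates a service that needs to be reworked before the switching.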
Checking Whether the System MAC Address Is the MAC Address of the Master Device in the Stack
Run the display system mac-address command and check the Stack MAC Information field in the command output to determine whether the system MAC address is the MAC address of the master device in the stack. If the Stack MAC Information field is displayed, the system MAC address is not the MAC address of the master device. In this case, run the undo set system mac-address command to restore the system MAC address to the default value. This prevents a system MAC address conflict between the two stack member devices after the stack splits.
[HUAWEI] display system mac-address
Current System MAC address: 8446-fea1-dd20(Used Stack MAC)
Current System MAC number : 16
User-configured MAC address: --
User-configured MAC number : --
System MAC Switch : Enable
System MAC Switch-delay: 10(minutes)
System MAC Inconsistence-alarm : Enable
System MAC Inconsistence-alarm Delay: 10(minutes)
Manufacture MAC Information:
--------------------------------------------------------------
Slot      MAC                 Number
--------------------------------------------------------------
1         8446-fea1-dd20      16
2         8446-fea1-e480      16
--------------------------------------------------------------
Stack MAC Information:
--------------------------------------------------------------
Slot      MAC                 Number
--------------------------------------------------------------
1         8446-fea1-dd20      16
2         8446-fea1-dd20      16
--------------------------------------------------------------
After the system MAC address of a stack is changed, the stack does not need to be restarted. The system MAC address is updated, and one or two service packets are lost. In addition, an ARP entry update is triggered on the remote device.
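As a rough aid for reading this output, the following Python sketch extracts the current system MAC address and the per-slot manufacture MAC addresses, then lists the slots whose manufacture MAC address matches. It is an illustration under assumptions: the field labels are taken from the sample output above, the function names are hypothetical, and the output does not say which slot is the master, so the result still has to be compared with the master's slot ID by hand.

```python
import re

MAC = r"[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}"


def current_system_mac(text):
    """Return the MAC after 'Current System MAC address:', or None."""
    match = re.search(r"Current System MAC address:\s*(" + MAC + ")", text, re.I)
    return match.group(1).lower() if match else None


def manufacture_macs(text):
    """Return {slot: mac} from the 'Manufacture MAC Information' table."""
    _, _, rest = text.partition("Manufacture MAC Information:")
    section, _, _ = rest.partition("Stack MAC Information:")
    rows = re.findall(
        r"^[ \t]*(\d+)[ \t]+(" + MAC + r")[ \t]+\d+[ \t]*$", section, re.I | re.M
    )
    return {int(slot): mac.lower() for slot, mac in rows}


def slots_matching_system_mac(text):
    """Slots whose manufacture MAC equals the current system MAC."""
    system_mac = current_system_mac(text)
    return sorted(s for s, mac in manufacture_macs(text).items() if mac == system_mac)
```

With the sample output in this section, slots_matching_system_mac reports slot 1, because the current system MAC address 8446-fea1-dd20 is the manufacture MAC address of slot 1.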
Checking Whether the Network is an SDN Network
If Agile Controller controls and manages a stack and the stack is switched to an M-LAG, Agile Controller cannot identify the two M-LAG member devices and therefore cannot control and manage the M-LAG. In this case, perform the following steps:
- Migrate all services to the standby device.
- After the stack is switched to an M-LAG, add M-LAG member devices to Agile Controller.
- Deliver services to the devices through Agile Controller.
Recording MAC Address Entry and Traffic Information
Before switching from a stack to an M-LAG, record the MAC address entry, interface traffic, and routing information. After the switching, compare the current information with the recorded information to verify that services have recovered. For details on how to query routing information, see Checking Whether the Number of Established Neighbor Relationships of Dynamic Routing Protocols Exceeds the Upper Limit. Run the following commands to check MAC address entry and interface traffic information.
<HUAWEI> display mac-address
Flags: * - Backup
BD   : bridge-domain
Age  : dynamic MAC learned time in seconds
-------------------------------------------------------------------------------
MAC Address    VLAN/VSI/BD   Learned-From        Type                Age
-------------------------------------------------------------------------------
0000-0000-0033 100/-/-       10GE1/0/1           dynamic      4294367295
0000-0000-0001 200/-/-       10GE1/0/2           static                -
-------------------------------------------------------------------------------
Total items: 2
<HUAWEI> display interface brief
PHY: Physical
*down: administratively down
^down: standby
(l): loopback
(s): spoofing
(b): BFD down
(e): ETHOAM down
(d): Dampening Suppressed
(p): port alarm down
(dl): DLDP down
(c): CFM down
(sd): STP instance discarding
InUti/OutUti: input utility rate/output utility rate
Interface                  PHY      Protocol  InUti OutUti   inErrors  outErrors
10GE1/0/1                  down     down         0%     0%          0          0
10GE1/0/2                  down     down         0%     0%          0          0
10GE1/0/3                  down     down         0%     0%          0          0
10GE1/0/4                  down     down         0%     0%          0          0
40GE1/0/1                  down     down         0%     0%          0          0
40GE1/0/2                  down     down         0%     0%          0          0
Eth-Trunk0                 down     down         0%     0%          0          0
GE1/0/1                    up       up        0.01%  0.01%          0          0
GE1/0/2                    up       up        0.01%  0.01%          0          0
GE1/0/3                    down     down         0%     0%          0          0
GE1/0/4                    down     down         0%     0%          0          0
GE1/0/5                    down     down         0%     0%          0          0
GE1/0/6                    up       up        0.01%  0.01%          0          0
GE1/0/7                    up       up        0.01%  0.01%          0          0
GE1/0/8                    down     down         0%     0%          0          0
GE1/0/9                    down     down         0%     0%          0          0
GE1/0/10                   down     down         0%     0%          0          0
  ---- More ----
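Comparing the recorded MAC address entries with those seen after the switching can be automated. The following Python sketch is one possible approach, not a Huawei tool: it assumes the row layout of the display mac-address sample above, and the names mac_entries and diff_entries are hypothetical.

```python
import re

ROW = re.compile(
    r"^[ \t]*([0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4})[ \t]+(\S+)[ \t]+(\S+)[ \t]+(\S+)",
    re.I | re.M,
)


def mac_entries(output):
    """Map (MAC, VLAN/VSI/BD) to the Learned-From interface."""
    return {
        (mac.lower(), domain): port
        for mac, domain, port, _entry_type in ROW.findall(output)
    }


def diff_entries(before, after):
    """Return (missing, moved) keys when comparing two snapshots.

    missing: entries recorded before the switching but absent afterwards.
    moved:   entries still present but learned from a different interface.
    """
    missing = sorted(key for key in before if key not in after)
    moved = sorted(
        key for key in before if key in after and before[key] != after[key]
    )
    return missing, moved
```

Entries in the moved list are not necessarily faults, because the interface an address is learned from can legitimately change after the switching. Entries in the missing list are the ones to investigate first.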
Switching from a Stack to an M-LAG in Five Steps
Figure 1-7 shows how to switch from a stack to an M-LAG.
- Delete the dual-active detection (DAD) link.
- Isolate the standby device in the stack and configure M-LAG on the standby device.
- Switch services on the master device to the original standby device, isolate the master device, and configure M-LAG on the master device.
- The two devices establish an M-LAG.
- Check the status and service recovery after the switching.
Deleting the DAD Link
Dual-active detection (DAD) is a protocol that detects stack split and dual-active situations and takes recovery actions to minimize the impact on services. When a stack is working properly, a DAD link fault does not affect services. Therefore, you can delete the DAD link before switching from the stack to an M-LAG. Deleting the DAD link prevents service interfaces on the standby device from entering the Error-Down state after the stack splits and prevents alarms from being generated when the stack is removed.
A DAD link can be configured on a service interface, Eth-Trunk, management interface, or stack interface. To delete a DAD link, perform one of the following operations:
- Delete the configuration of DAD in direct mode from a service interface.
system-view
interface interface-type interface-number
undo dual-active detect mode direct
quit
- Delete the configuration of DAD in relay mode from an Eth-Trunk.
system-view
interface eth-trunk trunk-id
undo dual-active detect mode relay
quit
- Delete the DAD configuration from a management interface.
system-view
interface meth 0/0/0
undo dual-active detect enable
quit
- Delete the DAD configuration from a stack interface.
system-view
interface stack-port member-id/port-id
undo dual-active detect mode direct
quit
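The choice of undo command depends only on where the DAD link is configured. The following Python sketch captures that mapping as a lookup table. It is a convenience illustration: the command strings are copied from the list above, whereas the link type labels and the function name dad_removal_commands are hypothetical.

```python
# Undo command for each place a DAD link can be configured (from the list above).
DAD_UNDO = {
    "service": "undo dual-active detect mode direct",
    "eth-trunk": "undo dual-active detect mode relay",
    "management": "undo dual-active detect enable",
    "stack-port": "undo dual-active detect mode direct",
}


def dad_removal_commands(link_type, interface):
    """Return the CLI lines that remove DAD from one interface.

    link_type is one of the hypothetical labels in DAD_UNDO;
    interface is the text that follows the 'interface' keyword.
    """
    if link_type not in DAD_UNDO:
        raise ValueError(f"unknown DAD link type: {link_type!r}")
    return ["system-view", f"interface {interface}", DAD_UNDO[link_type], "quit"]


for line in dad_removal_commands("management", "meth 0/0/0"):
    print(line)
```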
Isolating the Standby Device in the Stack and Adding the M-LAG Configuration
This section describes how to configure the standby device after it is isolated. To isolate the standby device in a stack, run the shutdown command on all service interfaces of the device. Perform the following steps to add the M-LAG configuration. You are advised to clear the existing configuration, and then configure M-LAG.
- Shut down all service interfaces on the standby device in the stack.
system-view
interface interface-type interface-number
shutdown
- Add the DFS group configuration.
system-view
dfs-group dfs-group-id
source ip ip-address [ vpn-instance vpn-instance-name ] [ peer peer-ip-address [ udp-port port-number ] ] [timeout seconds] //Configure a DAD link.
quit
Select a proper link as the DAD link of the M-LAG. The link can be a link between management interfaces, a direct link, or an uplink for communication between M-LAG member devices.
- Configure the STP mode.
system-view
stp mode rstp
stp v-stp enable
stp root primary
stp bridge-address mac-address
When the device functions as the root bridge, you need to run the stp root primary and stp bridge-address mac-address commands.
- Configure a peer-link interface.
- If the stack is set up using stack cables, remove the stack cables, connect service interfaces, add them to an Eth-Trunk, and configure the Eth-Trunk as the peer-link interface.
system-view
interface eth-trunk trunk-id
peer-link peer-link-id //Configure a peer-link.
quit
- Reuse the original stack link as the peer-link.
system-view
interface eth-trunk trunk-id
peer-link peer-link-id //Configure a peer-link.
quit
- Configure the original inter-device Eth-Trunk in the stack as an M-LAG member interface.
system-view
interface eth-trunk trunk-id
mode { lacp-static | lacp-dynamic }
dfs-group dfs-group-id m-lag m-lag-id //Bind the Eth-Trunk to a DFS group, that is, configure the Eth-Trunk as an M-LAG member interface.
quit
- Configure the virtual IP address and virtual MAC address of an active-active gateway.
Configure an IPv4 active-active gateway.
system-view
interface { vlanif vlan-id | vbdif bd-id }
ip address ip-address { mask | mask-length } [ sub ]
mac-address mac-address
quit
Configure an IPv6 active-active gateway.
system-view
interface { vlanif vlan-id | vbdif bd-id }
ipv6 enable
ipv6 address { ipv6-address prefix-length | ipv6-address/prefix-length } [ eui-64 ]
mac-address mac-address
quit
This step is mandatory when the M-LAG is connected to a VXLAN or Layer 3 network.
- Configure a Monitor Link group.
system-view
monitor-link group group-id
port interface-type interface-number { downlink [ downlink-id ] | uplink }
quit
- Change the router ID for OSPF and BGP.
system-view
ospf [ process-id | router-id router-id | vpn-instance vpn-instance-name ] *
There is only one router ID for OSPF and BGP in the original stack. After the stack is switched to an M-LAG, you need to configure different router IDs for the two M-LAG member devices. Similarly, the router ID on the network-side device needs to be changed to two different router IDs for the device to establish neighbor relationships of dynamic routing protocols with the two M-LAG member devices respectively.
- Change the engine ID of the local SNMP agent.
system-view
undo snmp-agent local-engineid
To connect the NMS to M-LAG member devices after the stack is switched to an M-LAG, you can delete the engine ID of the local SNMP agent on the device. The device will automatically generate a new engine ID.
Switching Services
After configuring the M-LAG backup device according to the previous section, perform the following steps to switch services from the master device in the stack to the original standby device.
- Check and record MAC address entry, interface traffic, and routing information of the master device in the stack.
display mac-address
display interface brief
display rip neighbor
display ospf peer
display ospfv3 peer
display isis peer
display bgp peer
- Shut down all interfaces on the master device in the stack.
system-view
interface interface-type interface-number
shutdown
quit
- After all interfaces on the master device are shut down, immediately run the undo shutdown command to restore interfaces on the original standby device in the stack to Up state.
system-view
interface interface-type interface-number
undo shutdown
quit
If the original standby device is the root bridge, STP convergence occurs on the entire network.
- Check whether services are running properly after the switching.
display mac-address
display interface brief
display rip neighbor
display ospf peer
display ospfv3 peer
display isis peer
display bgp peer
Check the outputs of the preceding commands and compare them with the outputs recorded before the interfaces on the master device were shut down. If services are abnormal, run the shutdown command on the original standby device and the undo shutdown command on the original master device to restore the traffic, and then check the configuration.
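The order of operations matters in this procedure: interfaces on the master device go down first, interfaces on the original standby device come up immediately afterwards, and the rollback reverses both actions. The following Python sketch generates the two command sequences for given interface lists. It is a planning aid under assumptions: the function names and the "master" and "standby" labels are hypothetical, and the sketch does not connect to any device.

```python
def interface_commands(interfaces, action):
    """Build CLI lines applying 'shutdown' or 'undo shutdown' to each interface."""
    if action not in ("shutdown", "undo shutdown"):
        raise ValueError(f"unsupported action: {action!r}")
    lines = ["system-view"]
    for name in interfaces:
        lines += [f"interface {name}", action, "quit"]
    return lines


def switchover_plan(master_ifaces, standby_ifaces):
    """Shut down the master's interfaces, then bring up the standby's."""
    return [
        ("master", interface_commands(master_ifaces, "shutdown")),
        ("standby", interface_commands(standby_ifaces, "undo shutdown")),
    ]


def rollback_plan(master_ifaces, standby_ifaces):
    """Reverse of switchover_plan, for use when services are abnormal."""
    return [
        ("standby", interface_commands(standby_ifaces, "shutdown")),
        ("master", interface_commands(master_ifaces, "undo shutdown")),
    ]


for device, commands in switchover_plan(["10ge 1/0/1"], ["10ge 2/0/1"]):
    print(device, commands)
```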
Isolating the Master Device in the Stack and Adding the M-LAG Configuration
Isolate the master device in the stack. Run the shutdown command to shut down all service interfaces of the master device in the stack. Perform the following steps to add the M-LAG configuration. You are advised to clear the existing configuration, and then configure M-LAG. For details on how to configure M-LAG, see the CloudEngine series switches product documentation.
- Shut down all service interfaces on the master device in the stack.
system-view
interface interface-type interface-number
shutdown
- Add the DFS group configuration.
system-view
dfs-group dfs-group-id
source ip ip-address [ vpn-instance vpn-instance-name ] [ peer peer-ip-address [ udp-port port-number ] ] [timeout seconds] //Configure a DAD link.
quit
Select a proper link as the DAD link of the M-LAG. The link can be a link between management interfaces, a direct link, or an uplink for communication between M-LAG member devices.
- Configure the STP mode.
system-view
stp mode rstp
stp v-stp enable
stp root primary
stp bridge-address mac-address
When the device functions as the root bridge, you need to run the stp root primary and stp bridge-address mac-address commands.
- Configure a peer-link interface.
- If the stack is set up using stack cables, remove the stack cables, connect service interfaces, add them to an Eth-Trunk, and configure the Eth-Trunk as the peer-link interface.
system-view
interface eth-trunk trunk-id
peer-link peer-link-id //Configure a peer-link.
quit
- Reuse the original stack link as the peer-link.
system-view
interface eth-trunk trunk-id
peer-link peer-link-id //Configure a peer-link.
quit
- Configure the original inter-device Eth-Trunk in the stack as an M-LAG member interface.
system-view
interface eth-trunk trunk-id
mode { lacp-static | lacp-dynamic }
dfs-group dfs-group-id m-lag m-lag-id //Bind the Eth-Trunk to a DFS group, that is, configure the Eth-Trunk as an M-LAG member interface.
quit
- Configure the virtual IP address and virtual MAC address of an active-active gateway.
Configure an IPv4 active-active gateway.
system-view
interface { vlanif vlan-id | vbdif bd-id }
ip address ip-address { mask | mask-length } [ sub ]
mac-address mac-address
quit
Configure an IPv6 active-active gateway.
system-view
interface { vlanif vlan-id | vbdif bd-id }
ipv6 enable
ipv6 address { ipv6-address prefix-length | ipv6-address/prefix-length } [ eui-64 ]
mac-address mac-address
quit
This step is mandatory when the M-LAG is connected to a VXLAN or Layer 3 network.
- Configure a Monitor Link group.
system-view
monitor-link group group-id
port interface-type interface-number { downlink [ downlink-id ] | uplink }
quit
- Change the router ID for OSPF and BGP.
system-view
ospf [ process-id | router-id router-id | vpn-instance vpn-instance-name ] *
There is only one router ID for OSPF and BGP in the original stack. After the stack is switched to an M-LAG, you need to configure different router IDs for the two M-LAG member devices. Similarly, the router ID on the network-side device needs to be changed to two different router IDs for the device to establish neighbor relationships of dynamic routing protocols with the two M-LAG member devices respectively.
- Change the engine ID of the local SNMP agent.
system-view
undo snmp-agent local-engineid
To connect the NMS to M-LAG member devices after the stack is switched to an M-LAG, you can delete the engine ID of the local SNMP agent on the device. The device will automatically generate a new engine ID.
- After the M-LAG master and backup devices are configured, run the undo shutdown command on interfaces of the M-LAG master device. The M-LAG is established.
system-view
interface interface-type interface-number
undo shutdown
quit
Querying the Status After the Switching and Comparing Services Before and After the Switching
To check whether the M-LAG is working properly, you need to query the M-LAG heartbeat status, master/backup status, and M-LAG member interface status.
[~HUAWEI] display dfs-group 1 m-lag
*                : Local node
Heart beat state : OK
Node 1 *
  Dfs-Group ID   : 1
  Priority       : 150
  Address        : ip address 10.3.3.4
  State          : Master
  Causation      : -
  System ID      : 0025-9e95-7c31
  SysName        : HUAWEIA
  Version        : V200R019C10
  Device Type    : CE6850EI
Node 2
  Dfs-Group ID   : 1
  Priority       : 120
  Address        : ip address 10.3.3.3
  State          : Backup
  Causation      : -
  System ID      : 0025-9e95-7c11
  SysName        : HUAWEIB
  Version        : V200R019C10
  Device Type    : CE6850EI
[~HUAWEI] display dfs-group 1 node 1 m-lag brief
* - Local node
M-Lag ID Interface Port State Status Consistency-check
1 Eth-Trunk 10 Up active(*)-active success
Failed reason:
1 -- Relationship between vlan and port is inconsistent
2 -- STP configuration under the port is inconsistent
3 -- STP port priority configuration is inconsistent
4 -- LACP mode of M-LAG is inconsistent
5 -- M-LAG configuration is inconsistent
6 -- The number of M-LAG members is inconsistent
In addition to checking the M-LAG status, you also need to check whether the service traffic is restored by querying MAC address entry, network-side routing, and total traffic information and comparing the information with that recorded in Recording MAC Address Entry and Traffic Information.
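The heartbeat and master/backup checks can be reduced to a small script. The following Python sketch reads the text of the display dfs-group 1 m-lag output and reports whether the heartbeat state is OK and whether exactly one Master and one Backup node are present. It assumes the field labels shown in the sample output above, and the function name mlag_health is hypothetical.

```python
import re


def mlag_health(output):
    """Summarize heartbeat and node roles from 'display dfs-group <id> m-lag' text."""
    heartbeat = re.search(r"Heart beat state[ \t]*:[ \t]*(\S+)", output)
    # Node role lines start with 'State'; the heartbeat line starts with 'Heart'.
    roles = sorted(re.findall(r"^[ \t]*State[ \t]*:[ \t]*(\S+)", output, re.M))
    heartbeat_ok = heartbeat is not None and heartbeat.group(1) == "OK"
    return {
        "heartbeat_ok": heartbeat_ok,
        "roles": roles,
        "healthy": heartbeat_ok and roles == ["Backup", "Master"],
    }
```

A healthy result covers only the DFS group state. The M-LAG member interface status and the consistency check result in the brief output still need to be reviewed separately.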