Stack Split and MAD
Stack Split
If you remove some member switches from a running stack without powering off the switches or if multiple stack cables fail, the stack splits into multiple stacks.
The previous master and standby switches are in the same stack after the split.
The previous master switch calculates the stack topology by deleting topology information related to the removed member switches, and synchronizes updated topology information to the other member switches. When removed member switches detect that the timeout timer for stack protocol packets has expired, the switches restart and begin a new master election.
In Figure 3-13, the previous master switch (SwitchA) and standby switch (SwitchB) are in the same stack after the stack split. SwitchA deletes topology information related to SwitchD and SwitchE and synchronizes topology information to SwitchB and SwitchC. After SwitchD and SwitchE restart, they set up a new stack.
The previous master and standby switches are in different stacks after the stack split.
The previous master switch selects a new standby switch in its stack, calculates stack topology information, and synchronizes updated topology information to the other member switches. The previous standby switch becomes the new master switch in its stack, calculates stack topology information, synchronizes stack topology information to the other member switches, and selects a new standby switch.
In Figure 3-14, the previous master switch (SwitchA) and standby switch (SwitchB) are in different stacks after the stack split. SwitchA specifies SwitchD as the new standby switch, calculates stack topology information, and synchronizes topology information to SwitchD and SwitchE. In the other stack, SwitchB becomes the master switch, calculates topology information, synchronizes topology information to SwitchC, and specifies SwitchC as the new standby switch.
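The two split scenarios above can be summarized in a short Python sketch. It is an illustrative model only, not switch software: the `SplitOutcome` type and the `handle_split` function are hypothetical names introduced here, and the sketch does not model how a new standby switch is selected or how the restarted switches run their election.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SplitOutcome:
    members: List[str]
    master: Optional[str]  # None until restarted switches finish a new election
    action: str


def handle_split(partitions, master, standby):
    """Describe what each partition does after a stack split (hypothetical model)."""
    outcomes = []
    for members in partitions:
        has_master = master in members
        has_standby = standby in members
        if has_master and has_standby:
            # Master and standby stay together: only the topology is pruned.
            action = "master deletes removed members from the topology and synchronizes it"
            outcomes.append(SplitOutcome(members, master, action))
        elif has_master:
            # Master lost its standby: it must select a new one.
            action = "master selects a new standby, recalculates and synchronizes the topology"
            outcomes.append(SplitOutcome(members, master, action))
        elif has_standby:
            # Standby lost its master: it takes over as master.
            action = "standby becomes master, recalculates the topology, selects a new standby"
            outcomes.append(SplitOutcome(members, standby, action))
        else:
            # Neither master nor standby present: protocol packets time out.
            action = "members restart and begin a new master election"
            outcomes.append(SplitOutcome(members, None, action))
    return outcomes


# Scenario of Figure 3-14: SwitchA (master) and SwitchB (standby) are separated.
for outcome in handle_split(
    [["SwitchA", "SwitchD", "SwitchE"], ["SwitchB", "SwitchC"]],
    master="SwitchA",
    standby="SwitchB",
):
    print(outcome.members, "->", outcome.master, ":", outcome.action)
```

Passing the partitions of Figure 3-13 instead (`[["SwitchA", "SwitchB", "SwitchC"], ["SwitchD", "SwitchE"]]`) yields SwitchA as the unchanged master of the first stack and no master yet for the second, whose members restart and elect one.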
Multi-Active Detection
All member switches in a stack use the same IP address and MAC address (stack MAC address). After a stack splits, more than one stack may use the same IP address and MAC address. To prevent this situation, a mechanism is required to check for IP address and MAC address collision after a split.
Multi-active detection (MAD) is a stack split detection protocol. If a stack splits due to a link failure, MAD provides split detection, multi-active handling, and fault recovery mechanisms to minimize the impact of a stack split on services.
MAD Modes
Direct mode
Use this mode when stack members have idle ports. In direct mode, stack members use direct links over ordinary network cables as dedicated MAD links. When the stack is running normally, member switches do not send MAD packets. After the stack splits, member switches send a MAD packet every 1 second over a MAD link to check whether more than one master switch exists.
In direct mode, stack members can be either directly connected to an intermediate device or fully meshed with each other:
- Directly connected to an intermediate device (Figure 3-15): Each member switch has at least one MAD link connected to the intermediate device.
- Fully meshed with each other (Figure 3-16): In the full-mesh topology, at least one MAD link exists between any two member switches.
The use of an intermediate device can shorten the MAD links between member switches. This topology applies to stacks with a long distance between member switches. The full-mesh topology prevents MAD failures caused by intermediate device failures, but full-mesh connections occupy many interfaces on the member switches. Therefore, this topology applies to stacks with only a few member switches.
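The interface cost of the two direct-mode topologies can be made concrete with a little arithmetic. The sketch below assumes the minimum wiring described above (exactly one MAD link per member switch to the intermediate device, or exactly one MAD link between every pair of member switches). The function names are hypothetical.

```python
def links_via_intermediate(members: int) -> int:
    """Minimum MAD links when every member connects once to the intermediate device."""
    return members


def links_full_mesh(members: int) -> int:
    """Minimum MAD links when every pair of members shares one direct link."""
    return members * (members - 1) // 2


def ports_per_member_full_mesh(members: int) -> int:
    """Interfaces each member switch dedicates to MAD in a minimal full mesh."""
    return members - 1


for n in (2, 3, 4, 6):
    print(n, links_via_intermediate(n), links_full_mesh(n), ports_per_member_full_mesh(n))
```

The full-mesh link count grows quadratically with the number of member switches, which is why that topology suits only small stacks.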
- After configuring MAD in direct mode on an interface, do not configure other services on the interface.
- A maximum of eight direct MAD links can be configured between member switches to ensure reliability.
- MAD packets are bridge protocol data units (BPDUs), so the intermediate device must be able to forward BPDUs. You need to run the l2protocol-tunnel user-defined-protocol command to enable the intermediate device to forward BPDUs. In this command, protocol-mac protocol-mac must be set to 0180-c200-000a. For details on how to configure this function, see Configuring Interface-based Layer 2 Protocol Tunneling in "Layer 2 Protocol Transparent Transmission Configuration" in the S2720, S5700, and S6700 V200R019C10 Configuration Guide - Ethernet Switching.
Relay mode
In relay mode, MAD relay detection is configured on an Eth-Trunk interface in the stack, and the MAD detection function is enabled on an agent. Every member switch must have a link to the agent and these links must be added to the same Eth-Trunk. In contrast to the direct mode, the relay mode does not require additional interfaces because the Eth-Trunk interface can run other services while performing MAD relay detection.
In relay mode, when the stack is running normally, member switches send MAD packets at an interval of 30 seconds over the MAD links and do not process received MAD packets. After the stack splits, member switches send MAD packets at an interval of 1 second over the MAD links to check whether more than one master switch exists.
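The sending behavior of the two MAD modes can be compared side by side. The following sketch only restates the intervals given in this section. The function name `mad_send_interval` and the string mode names are hypothetical and do not correspond to any device command.

```python
from typing import Optional


def mad_send_interval(mode: str, stack_split: bool) -> Optional[int]:
    """Return the MAD packet interval in seconds, or None if no packets are sent."""
    if mode not in ("direct", "relay"):
        raise ValueError(f"unknown MAD mode: {mode!r}")
    if stack_split:
        # After a split, both modes probe once per second.
        return 1
    # Normal operation: direct mode is silent, relay mode sends every 30 seconds.
    return 30 if mode == "relay" else None


for mode in ("direct", "relay"):
    for split in (False, True):
        print(mode, split, mad_send_interval(mode, split))
```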
You can use an independent relay agent (as in Figure 3-17) or use two stacks as each other's relay agents (as in Figure 3-18).
The relay agent is a switch that supports the MAD relay function. Currently, all the S series switches support this function.
To implement MAD relay detection by using two stacks as each other's relay agent, configure different domain IDs for the two stacks. Member switches of a stack form a stack domain. A network may have multiple stack domains, with different domain IDs.
Each Eth-Trunk can have a maximum of eight member interfaces. Therefore, when a stack contains nine member switches, one Eth-Trunk cannot provide MAD links for all the member switches. In this case, configure multiple Eth-Trunks to ensure that a MAD link is available between every two member switches. In Figure 3-19, Eth-Trunk1 provides MAD links for Switch1 through Switch8, Eth-Trunk2 provides MAD links for Switch2 through Switch9, and Eth-Trunk3 provides MAD links for Switch1 and Switch9.
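Whether a planned set of Eth-Trunks gives every pair of member switches a shared MAD link can be checked mechanically. The sketch below is a planning aid under two assumptions: each member switch contributes one member interface to each Eth-Trunk it joins, and two switches have a MAD link whenever they belong to the same Eth-Trunk. The function name `uncovered_pairs` is hypothetical.

```python
from itertools import combinations

MAX_TRUNK_MEMBERS = 8  # maximum member interfaces per Eth-Trunk, as stated above


def uncovered_pairs(members, trunks):
    """Return the member-switch pairs that do not share any Eth-Trunk."""
    for name, attached in trunks.items():
        if len(attached) > MAX_TRUNK_MEMBERS:
            raise ValueError(f"{name} has more than {MAX_TRUNK_MEMBERS} member interfaces")
    return [
        (a, b)
        for a, b in combinations(sorted(members), 2)
        if not any(a in attached and b in attached for attached in trunks.values())
    ]


# Layout of Figure 3-19 for a nine-member stack.
members = {f"Switch{i}" for i in range(1, 10)}
trunks = {
    "Eth-Trunk1": {f"Switch{i}" for i in range(1, 9)},   # Switch1 through Switch8
    "Eth-Trunk2": {f"Switch{i}" for i in range(2, 10)},  # Switch2 through Switch9
    "Eth-Trunk3": {"Switch1", "Switch9"},
}
print(uncovered_pairs(members, trunks))  # [] : every pair is covered
```

Dropping Eth-Trunk3 from this layout leaves exactly one uncovered pair, Switch1 and Switch9, which shows why the third Eth-Trunk is needed.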
Multi-Active Handling
After a stack splits, the MAD mechanism sets the new stacks to the Detect or Recovery state. The stack in Detect state still works, whereas the stack in Recovery state is disabled.
MAD handles a multi-active situation as follows: When the MAD split detection mechanism detects multiple stacks in Detect state, the stacks compete to retain the Detect state. The stacks that fail in the competition enter the Recovery state, and all the physical ports except the reserved ports on the member switches in these stacks are shut down. The stacks in Recovery state no longer forward service packets. The competition is decided by the following rules, applied in order:
- The stack that completes its startup first enters the Detect state. If the difference in the time taken for multiple stacks to complete their startup is within 20 seconds, the stacks are considered to complete their startup at the same time.
- When the stacks complete their startup at the same time, the stack in which the master switch has the highest stack priority enters the Detect state.
- When the master switches in the stacks have the same stack priority, the stack with the smallest MAC address enters the Detect state.
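The competition rules above can be expressed as a short selection routine. This is a minimal sketch, not the device implementation. It assumes that startup times are compared against the earliest stack and that a difference of exactly 20 seconds still counts as simultaneous. The source does not say which MAC address is compared, so the sketch simply takes one MAC address per stack. The `StackCandidate` type and all function names are hypothetical.

```python
from dataclasses import dataclass

STARTUP_TIE_WINDOW_S = 20  # startups this close together count as simultaneous


@dataclass(frozen=True)
class StackCandidate:
    name: str
    startup_done_s: float  # time at which the stack completed its startup
    master_priority: int   # stack priority of the stack's master switch
    mac: str               # MAC address used for the final tie-break


def mac_to_int(mac: str) -> int:
    """Convert a MAC such as '00e0-fc00-0001' into an integer for comparison."""
    return int(mac.replace("-", "").replace(":", "").replace(".", ""), 16)


def select_detect_stack(stacks):
    """Return the stack that keeps the Detect state."""
    earliest = min(s.startup_done_s for s in stacks)
    # Rule 1: only stacks that started (effectively) first stay in the race.
    tied = [s for s in stacks if s.startup_done_s - earliest <= STARTUP_TIE_WINDOW_S]
    # Rule 2: highest master priority wins. Rule 3: smallest MAC address wins.
    return min(tied, key=lambda s: (-s.master_priority, mac_to_int(s.mac)))


def resolve_states(stacks):
    """Map every stack name to 'Detect' or 'Recovery'."""
    winner = select_detect_stack(stacks)
    return {s.name: "Detect" if s is winner else "Recovery" for s in stacks}


stacks = [
    StackCandidate("StackA", 0.0, 100, "00e0-fc00-0002"),
    StackCandidate("StackB", 15.0, 200, "00e0-fc00-0003"),
]
print(resolve_states(stacks))
```

In this example the two stacks complete startup 15 seconds apart, so they count as simultaneous, and StackB keeps the Detect state because its master switch has the higher stack priority.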
Fault Recovery
- After the faulty link recovers, the stack in Recovery state restarts and merges with the stack in Detect state, and the service ports that have been shut down are restored to Up state. Then the entire stack recovers.
- If the stack in Detect state is also faulty before the faulty link recovers, you can remove this stack from the network and start the stack in Recovery state using a command to direct service traffic to this stack. Then rectify the stack system fault and link fault. After the stack recovers, connect it to the network so that it can merge with the other stack.