Recommended Networking
The following networking modes are recommended for AI Fabric in the AI GPU scenario:
- Single-layer networking
As shown in Figure 3-1, single-layer networking is deployed in a PoD. Servers are directly connected to leaf switches through 100GE links and form a GPU compute cluster. Server NICs support RoCEv2, and RoCEv2 traffic between servers is transmitted only within the PoD.
If the number of server nodes is less than or equal to 64, you are advised to use single-layer networking and deploy the CE8850-64CQ-EI as the leaf switch.
- Two-layer networking
As shown in Figure 3-2, a two-layer spine-leaf (Clos) architecture is deployed in a PoD, and leaf and spine nodes are fully meshed through 100GE links. Servers are connected to leaf switches through 100GE links. Server NICs support RoCEv2, and RoCEv2 traffic between servers is transmitted only within the PoD.
To ensure ultra-low latency and high throughput, servers can be single-homed to leaf switches. Table 3-1 lists the recommended device models based on the number of server nodes (see the selection sketch after the table notes).
Table 3-1 Recommended device models for two-layer networking in the AI GPU scenario

| Number of Server Nodes (N) | Leaf | Spine |
| --- | --- | --- |
| N ≤ 1024 | CE8861-4C-EI | CE8850-64CQ-EI |
| N > 1024 | CE8861-4C-EI | CloudEngine 16800 (equipped with CE-MPUE series MPUs) |
Description
- The oversubscription ratio (downlink bandwidth:uplink bandwidth) is 1:1.
- Currently, only IPv4 underlay networking is supported.
- If a large Layer 2 network is required, it is recommended that VXLAN be used to build a large Layer 2 overlay network. In this case, the ECN overlay function needs to be configured.
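The single-layer guideline and Table 3-1 can be summarized as a simple lookup. The following Python sketch is illustrative only; the function name and returned fields are not part of the product documentation and merely restate the selection rules above.

```python
# Illustrative lookup of the recommended networking mode and device models.
# Encodes the single-layer guideline (N <= 64) and Table 3-1; all names are
# hypothetical and only restate the documented recommendations.

def recommend_fabric(server_nodes: int) -> dict:
    """Return the recommended mode and device models for N server nodes."""
    if server_nodes <= 64:
        # Single-layer networking: servers connect directly to leaf switches.
        return {"mode": "single-layer", "leaf": "CE8850-64CQ-EI", "spine": None}
    if server_nodes <= 1024:
        # Two-layer spine-leaf networking, N <= 1024.
        return {"mode": "two-layer", "leaf": "CE8861-4C-EI", "spine": "CE8850-64CQ-EI"}
    # Two-layer spine-leaf networking, N > 1024.
    return {"mode": "two-layer", "leaf": "CE8861-4C-EI",
            "spine": "CloudEngine 16800 (CE-MPUE series MPUs)"}

print(recommend_fabric(1024))
# {'mode': 'two-layer', 'leaf': 'CE8861-4C-EI', 'spine': 'CE8850-64CQ-EI'}
```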
The following uses a network with 1024 servers as an example. On this network, the CE8850-64CQ-EI is used as the spine switch, providing 64 x 100GE ports; the CE8861-4C-EI is used as the leaf switch, providing 32 x 100GE ports; and the oversubscription ratio is 1:1. Servers are single-homed to leaf switches. Based on these switch models, the numbers of required switches are calculated as follows (the sketch after these steps reproduces the arithmetic):
- Calculate the uplink bandwidth: Each CE8861-4C-EI leaf switch provides 32 x 100GE ports. With a 1:1 oversubscription ratio, 16 ports are used as uplinks to spine switches, so the uplink bandwidth of each leaf switch is 16 x 100GE.
- Calculate the number of leaf switches: Because the oversubscription ratio is 1:1 and the uplink bandwidth is 16 x 100GE, each leaf switch also provides 16 x 100GE downlinks, that is, each leaf switch connects to 16 servers through 100GE access links. With 1024 servers single-homed to leaf switches, 1024/16 = 64 leaf switches are required.
- Calculate the number of spine switches: The 64 leaf switches provide 64 x 16 = 1024 x 100GE uplinks in total, and each CE8850-64CQ-EI spine switch provides 64 x 100GE ports. Therefore, 1024/64 = 16 spine switches are required.
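The preceding arithmetic can be reproduced in a few lines. The following Python sketch assumes the values stated above (32 x 100GE ports per CE8861-4C-EI leaf, 64 x 100GE ports per CE8850-64CQ-EI spine, single-homed servers, and a 1:1 oversubscription ratio); the variable names are illustrative only.

```python
# Sketch of the switch-count calculation for the 1024-server example.
# Port counts and the 1:1 oversubscription ratio are taken from the text;
# variable names are illustrative.

SERVERS           = 1024
LEAF_PORTS_100GE  = 32   # CE8861-4C-EI (leaf)
SPINE_PORTS_100GE = 64   # CE8850-64CQ-EI (spine)

# A 1:1 oversubscription ratio splits leaf ports evenly between
# server-facing downlinks and spine-facing uplinks.
downlinks_per_leaf = LEAF_PORTS_100GE // 2            # 16 servers per leaf
uplinks_per_leaf   = LEAF_PORTS_100GE // 2            # 16 x 100GE uplinks per leaf

leaf_switches  = SERVERS // downlinks_per_leaf        # 1024 / 16 = 64
total_uplinks  = leaf_switches * uplinks_per_leaf     # 64 x 16 = 1024
spine_switches = total_uplinks // SPINE_PORTS_100GE   # 1024 / 64 = 16

print(f"Leaf switches: {leaf_switches}, spine switches: {spine_switches}")
# Leaf switches: 64, spine switches: 16
```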