Recommended Networking
The following networking modes are recommended for AI Fabric in the HPC scenario:
- Single-layer networking
As shown in Figure 4-1, single-layer networking is deployed in a PoD. Servers are connected to leaf switches through 100GE links. Server NICs support RoCEv2. RoCEv2 traffic between servers is transmitted only within the PoD.
If the number of server nodes is less than or equal to 64, you are advised to use single-layer networking and deploy the CE8850-64CQ-EI as the leaf switch.
- Two-layer networking
As shown in Figure 4-2, the two-layer Clos architecture (spine-leaf) is deployed in a PoD, and leaf and spine nodes are fully meshed through 100GE links. Servers are connected to leaf switches through 100GE links. Server NICs support RoCEv2. RoCEv2 traffic between servers is transmitted only in the PoD.
To ensure ultra-low latency, servers can be single-homed. Table 4-1 lists the recommended device models based on the number of server nodes. Currently, only IPv4 underlay networking is supported.
Table 4-1 Recommended device models for two-layer networking in the HPC scenario

| Number of Server Nodes (N) | Leaf | Spine |
| --- | --- | --- |
| N ≤ 3072 | CE8850-64CQ-EI | CE8850-64CQ-EI |
| N > 3072 | CE8850-64CQ-EI | CloudEngine 16800 (equipped with CE-MPUE series MPUs) |
Note
- The oversubscription ratio (downlink bandwidth:uplink bandwidth) can be 4:3, 2:1, or 1:1 (recommended).
- If a large Layer 2 network is required, you are advised to use VXLAN to build it as an overlay network. In this case, the ECN overlay function must be configured.
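The model recommendations above can be summarized as a small lookup. The following Python sketch is illustrative only: the function name `recommend_models` and the returned dictionary layout are hypothetical, and it assumes that single-layer networking is preferred whenever the number of server nodes is 64 or fewer, with Table 4-1 applied otherwise.

```python
from typing import Dict, Optional


def recommend_models(server_nodes: int) -> Dict[str, Optional[str]]:
    """Map a server-node count to the networking mode and device models.

    Hypothetical helper: thresholds (64 and 3072) and model names are taken
    from this section; everything else is an illustrative assumption.
    """
    if server_nodes <= 0:
        raise ValueError("server_nodes must be a positive integer")

    leaf = "CE8850-64CQ-EI"  # recommended leaf model in every case

    # Assumption: prefer single-layer networking when N <= 64.
    if server_nodes <= 64:
        return {"networking": "single-layer", "leaf": leaf, "spine": None}

    # Two-layer (spine-leaf) networking, per Table 4-1.
    if server_nodes <= 3072:
        spine = "CE8850-64CQ-EI"
    else:
        spine = "CloudEngine 16800 (equipped with CE-MPUE series MPUs)"
    return {"networking": "two-layer", "leaf": leaf, "spine": spine}


if __name__ == "__main__":
    for n in (48, 1024, 4096):
        print(n, recommend_models(n))
```

The sketch does not take the oversubscription ratio or a VXLAN overlay into account; those choices still have to be made separately.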
The following example uses a network with 1024 servers. CE8850-64CQ-EI switches, each providing 64 x 100GE ports, are used as both the spine and leaf switches, and the oversubscription ratio is 1:1. Each server is single-homed to a leaf switch. Based on this switch model, the numbers of required switches are calculated as follows:
- Calculate the uplink bandwidth: Each leaf switch uses 16 x 100GE uplinks to connect to the spine switches. Therefore, the uplink bandwidth of each leaf switch is 16 x 100GE.
- Calculate the number of leaf switches: The oversubscription ratio is 1:1 and the uplink bandwidth is 16 x 100GE, so the downlink bandwidth of each leaf switch is also 16 x 100GE. With 100GE access links, this means 16 downlinks, that is, each leaf switch connects to 16 servers. Because the 1024 servers are single-homed to leaf switches, a total of 64 (1024/16) leaf switches are required.
- Calculate the number of spine switches: The 64 leaf switches each have 16 x 100GE uplinks, for a total of 1024 (64 x 16) uplinks. Each CE8850-64CQ-EI provides 64 x 100GE interfaces. Therefore, a total of 16 (1024/64) spine switches are required. In the full mesh, each leaf switch then has one 100GE link to each spine switch.
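The preceding calculation can be reproduced with a short script. The following Python sketch is not a planning tool: the function name `size_two_layer_fabric` and its parameters are hypothetical. It assumes that all links run at the same speed (100GE here), so bandwidth ratios reduce to port-count ratios, and that servers are single-homed. It follows the arithmetic of the three steps above and does not check other constraints, such as whether the number of leaf switches fits within the ports of one spine switch.

```python
from fractions import Fraction
from math import ceil, floor
from typing import Tuple


def size_two_layer_fabric(
    servers: int,
    uplinks_per_leaf: int,
    spine_ports: int,
    oversubscription: Fraction = Fraction(1, 1),
) -> Tuple[int, int]:
    """Return (leaf_count, spine_count) for a single-homed spine-leaf PoD.

    oversubscription is downlink bandwidth : uplink bandwidth, for example
    Fraction(1, 1), Fraction(2, 1), or Fraction(4, 3).
    """
    # Downlinks per leaf = uplinks x ratio, rounded down to whole ports.
    downlinks_per_leaf = floor(uplinks_per_leaf * oversubscription)
    if downlinks_per_leaf < 1:
        raise ValueError("each leaf switch needs at least one downlink")

    # Each single-homed server occupies one downlink on one leaf switch.
    leaf_count = ceil(servers / downlinks_per_leaf)

    # Every leaf uplink terminates on a spine port.
    total_uplinks = leaf_count * uplinks_per_leaf
    spine_count = ceil(total_uplinks / spine_ports)
    return leaf_count, spine_count


if __name__ == "__main__":
    # Example from this section: 1024 servers, 16 uplinks per leaf,
    # 64-port spine switches, 1:1 oversubscription.
    print(size_two_layer_fabric(1024, 16, 64))  # (64, 16)
```

Passing a different ratio, such as `Fraction(2, 1)`, changes only the number of downlinks per leaf switch; the remaining steps are unchanged.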