FusionInsight HD 6.5.0 Software Installation 02

Real-Time Stream Processing Scenario


Real-time stream processing analyzes data as it arrives so that follow-up actions can be triggered immediately. It requires high processing speed and, because of the large data volume, substantial CPU and memory resources. Storage requirements are comparatively low because in most cases the data does not need to be stored.

In real-time stream processing scenarios, components such as Kafka, Yarn, Spark, Flink, and Redis need to be deployed, as shown in the following figure.

Figure 2-3 Real-time stream processing scenario

The configuration in the real-time stream processing scenario is as follows:

Table 2-4 Real-time stream processing scenario

Node Type

Server Configuration

Number of Nodes

Description

Management node

  • CPU:
    • X86: 2-socket 8-core CPU or above
    • Huawei TaiShan server: dual-socket 32-core 1616 processor or more
  • Memory: 256 GB or above
  • Disk: 6 x 2.5-inch 600 GB SAS disks
  • RAID card: 1 GB LSI RAID 0/1 card (supporting three or more RAID 1 groups)
  • NIC: Bond the ports as follows and connect them to two access switches.
    • Management plane: Two GE ports are bonded.
    • Service plane: Two 10GE ports are bonded.

Number of nodes: 2

The six disks on each node form three RAID 1 groups, which are used as follows (both management nodes use the same partition layout). For details about partitions, see Software Installation > Preparations for Installation > Preparing OS.

  • OS disk
  • /srv/BigData/dbdata_om partition
  • /srv/BigData/LocalBackup partition

Control node

  • CPU:
    • X86: 2-socket 8-core CPU or above
    • Huawei TaiShan server: dual-socket 32-core 1616 processor or more
  • Memory: 256 GB or above
  • Disk:
    • 3 or 5 control nodes: 10 x 2.5-inch 600 GB SAS disks
    • 9 or 11 control nodes: 6 x 2.5-inch 600 GB SAS disks
  • RAID card: 1 GB LSI RAID 0/1 card (supporting five or more RAID 1 groups)
  • NIC: Bond the ports as follows and connect them to two access switches.
    • Management plane: Two GE ports are bonded.
    • Service plane: Two 10GE ports are bonded.

Number of nodes: 3, 5, 9, or 11

  • The number of control nodes is calculated based on the number of data nodes. For details, see Software Installation > Installation Introduction > Solution Introduction > Installation Solution > Node Deployment Scheme.
  • Ten disks on a single node form five RAID 1 groups. The functions are as follows (15 RAID 1 groups are evenly distributed on three nodes). For details about partitions, see Software Installation > Preparations for Installation > Preparing OS.
    • OS disk x 3
    • /srv/BigData/dbdata_service x 2
    • /srv/BigData/zookeeper x 3
      NOTE:

      If the number of data nodes is less than or equal to 100, deploy three ZooKeeper nodes. If the number of data nodes is greater than 100, deploy five ZooKeeper nodes.

    • /srv/BigData/journalnode x 3
    • /srv/BigData/namenode x 2
    • /srv/BigData/storm x 2
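
The ZooKeeper sizing rule in the note above can be written as a small helper. This is a minimal sketch; the function name zookeeper_node_count and its parameter are hypothetical and are not part of FusionInsight HD.

```python
def zookeeper_node_count(data_nodes: int) -> int:
    """Return the number of ZooKeeper nodes to deploy.

    Rule from the note above: three ZooKeeper nodes for up to
    100 data nodes, five ZooKeeper nodes for more than 100.
    """
    if data_nodes < 0:
        raise ValueError("data_nodes must not be negative")
    return 3 if data_nodes <= 100 else 5


if __name__ == "__main__":
    for n in (20, 100, 101):
        print(n, "data nodes ->", zookeeper_node_count(n), "ZooKeeper nodes")
```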

Management + Control node (integrated deployment)

  • CPU:
    • X86: 2-socket 8-core CPU or above
    • Huawei TaiShan server: dual-socket 32-core 1616 processor or more
  • Memory: 256 GB or above
  • Disk: 12 x 2.5-inch 600 GB SAS disks
  • RAID card: 1 GB LSI RAID 0/1 card (supporting six or more RAID 1 groups)
  • NIC: Bond the ports as follows and connect them to two access switches.
    • Management plane: Two GE ports are bonded.
    • Service plane: Two 10GE ports are bonded.

Number of nodes: 3

In the integrated deployment scenario, the 12 disks on each node form six RAID 1 groups, so some partitions must share disks. By default, the /srv/BigData/journalnode partition and the /srv/BigData/storm partition are combined on the same disks.

If the node has enough disk resources, you are advised to add a RAID 1 group so that every partition occupies its disks exclusively.

Kafka data node

  • CPU:
    • X86: 2-socket 8-core CPU or above
    • Huawei TaiShan server: dual-socket 32-core 1616 processor or more
  • Memory: 128 GB or above
  • Disk:
    • 2 x 2.5-inch 600 GB SAS disks
    • 25 x 2.5-inch 1.2 TB SAS disks
  • RAID card: 1 GB LSI RAID 0/1 card
    • Supports one or more RAID 1 groups.
    • Supports 25 or more RAID 0 groups or JBOD.
  • NIC: Bond the ports as follows and connect them to two access switches.
    • Management plane: Two GE ports are bonded.
    • Service plane: Two 10GE ports are bonded.

The number of nodes is calculated based on the throughput.

Kafka nodes cache messages and forward them for stream processing.

  • Two 600 GB SAS disks on a node form a RAID 1 group and are used as OS disks.
  • Twenty-five 1.2 TB SAS disks on a node form RAID 0 groups or are used without RAID (JBOD).
  • Calculation of node quantity:
    • Based on the throughput:

      X (Total throughput, MB/s)/100 (Maximum Producer throughput per node, MB/s)/0.85 (Reservation ratio)

    • Based on the storage capacity:

      X (Total throughput, MB/s) x 3600 x 24 x D (Number of days, 7 by default) x 2 (Number of copies)/1024/1024/(25 x 1.2)/0.85 (Reservation ratio)

Use the larger of the two values. At least two nodes must be configured.
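
The two Kafka formulas above can be combined into one calculation. The following is a minimal sketch, not an official sizing tool: the function name kafka_node_count and its parameter names are hypothetical, and rounding the result up to a whole node is an assumption, since the formulas do not state how fractions are handled.

```python
import math


def kafka_node_count(total_throughput_mb_s: float,
                     retention_days: int = 7,
                     copies: int = 2,
                     producer_mb_s_per_node: float = 100,
                     disks_per_node: int = 25,
                     disk_size_tb: float = 1.2,
                     reservation_ratio: float = 0.85,
                     minimum_nodes: int = 2) -> int:
    """Estimate the number of Kafka data nodes from the two formulas above."""
    # Throughput-based estimate: X / 100 / 0.85
    by_throughput = (total_throughput_mb_s / producer_mb_s_per_node
                     / reservation_ratio)

    # Capacity-based estimate:
    # X x 3600 x 24 x D x 2 / 1024 / 1024 / (25 x 1.2) / 0.85
    stored_tb = (total_throughput_mb_s * 3600 * 24 * retention_days * copies
                 / 1024 / 1024)
    by_capacity = (stored_tb / (disks_per_node * disk_size_tb)
                   / reservation_ratio)

    # Use the larger value; rounding up is an assumption of this sketch.
    needed = math.ceil(max(by_throughput, by_capacity))
    return max(minimum_nodes, needed)


if __name__ == "__main__":
    print(kafka_node_count(200))
```

With the default values, a total throughput of 200 MB/s gives about 2.4 nodes by throughput and about 9.05 nodes by capacity, so the sketch returns 10.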

Flink/SparkStreaming node

  • CPU:
    • X86: 2-socket 8-core CPU or above
    • Huawei TaiShan server: dual-socket 32-core 1616 processor or more
  • Memory: 256 GB or above
  • Disk:
    • 2 x 2.5-inch 600 GB SAS disks
    • 2 x 2.5-inch 600 GB (or above) SAS disks
  • RAID card: 1 GB LSI RAID 0/1 card
    • Supports one or more RAID 1 groups.
    • Supports 2 or more RAID 0 groups or JBOD.
  • NIC: Bond the ports as follows and connect them to two access switches.
    • Management plane: Two GE ports are bonded.
    • Service plane: Two 10GE ports are bonded.

The number of nodes is calculated based on the computing workload.

Flink and Spark are computing engines used for stream processing.

Number of stream processing nodes when Flink is used:

X (Total throughput, MB/s)/30 (Throughput per node, MB/s)/0.85 (Reservation ratio)

Number of stream processing nodes when SparkStreaming is used:

X (Total throughput, MB/s)/10 (Throughput per node, MB/s)/0.85 (Reservation ratio)

At least three nodes must be configured.
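
The Flink and SparkStreaming formulas above differ only in the per-node throughput, so one function can cover both. This is a minimal sketch under stated assumptions: the function name stream_node_count and the engine keys are hypothetical, and rounding up to a whole node is an assumption.

```python
import math

# Per-node throughput (MB/s) taken from the formulas above.
THROUGHPUT_PER_NODE_MB_S = {
    "flink": 30,
    "sparkstreaming": 10,
}


def stream_node_count(total_throughput_mb_s: float,
                      engine: str,
                      reservation_ratio: float = 0.85,
                      minimum_nodes: int = 3) -> int:
    """Estimate the number of stream processing nodes for one engine."""
    per_node = THROUGHPUT_PER_NODE_MB_S[engine.lower()]
    # X / (throughput per node) / 0.85, rounded up (assumption).
    needed = math.ceil(total_throughput_mb_s / per_node / reservation_ratio)
    # At least three nodes must be configured.
    return max(minimum_nodes, needed)


if __name__ == "__main__":
    for name in THROUGHPUT_PER_NODE_MB_S:
        print(name, stream_node_count(200, name))
```

For a total throughput of 200 MB/s, the sketch returns 8 nodes for Flink and 24 nodes for SparkStreaming.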

Redis node

  • CPU:
    • X86: 2-socket 8-core CPU or above
    • Huawei TaiShan server: dual-socket 32-core 1616 processor or more
  • Memory: 512 GB or above
  • Disk:
    • 2 x 2.5-inch 600 GB SAS disks
    • 8 x 2.5-inch 600 GB SAS disks
  • RAID card: 1 GB LSI RAID 0/1 card
    • Supports one or more RAID 1 groups.
    • Supports 8 or more RAID 0 groups or JBOD.
  • NIC: Bond the ports as follows and connect them to two access switches.
    • Management plane: Two GE ports are bonded.
    • Service plane: Two 10GE ports are bonded.

The number of nodes is calculated based on the data volume.

Redis nodes provide a distributed in-memory key-value database that caches stream-processed data. Number of Redis nodes:

M (Total data volume, GB) x 2 (two copies)/512 (Memory of a single node, GB)/0.85 (Reservation ratio)

At least three nodes must be configured.
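
The Redis formula above can be sketched in the same way. The function name redis_node_count and its parameter names are hypothetical, and rounding up to a whole node is an assumption.

```python
import math


def redis_node_count(total_data_gb: float,
                     copies: int = 2,
                     memory_per_node_gb: float = 512,
                     reservation_ratio: float = 0.85,
                     minimum_nodes: int = 3) -> int:
    """Estimate the number of Redis nodes from the total data volume."""
    # M x 2 / 512 / 0.85, rounded up (assumption).
    needed = math.ceil(total_data_gb * copies / memory_per_node_gb
                       / reservation_ratio)
    # At least three nodes must be configured.
    return max(minimum_nodes, needed)


if __name__ == "__main__":
    print(redis_node_count(1000))
```

For a total data volume of 1000 GB, the formula gives about 4.6, so the sketch returns 5 nodes.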

Updated: 2019-05-17

Document ID: EDOC1100074555
