FusionInsight HD 6.5.0 Product Description 02

Software Deployment Scheme

Software List

Table 3-5 shows the versions of the open-source components required by FusionInsight HD 6.5.0.

Table 3-5 Software list

Component                                   Version
DBService                                   2.7.0
Elasticsearch                               6.1.3
Elk                                         6.5.0
Flink                                       1.4.0
Flume                                       1.6.0
FTP-Server                                  1.1.1
GraphBase                                   6.5.0
Hadoop (including HDFS/MapReduce/Yarn)      3.1.1
HBase                                       1.3.1
Hive                                        3.1.0
Hue                                         3.11.0
Kafka                                       2.11-1.1.0
KrbServer                                   1.17
Loader (Sqoop)                              1.99.3
LdapServer                                  2.7.0
Oozie                                       4.2.0
Phoenix                                     4.13.1
Redis                                       3.0.5
SmallFS                                     1.0.0
Solr                                        6.2.0
Spark                                       1.5.1
Spark2x                                     2.3.2
Storm                                       1.0.2
ZooKeeper                                   3.5.1

Deployment Principles

Table 3-6 describes the memory requirements and deployment principles of service roles.

The dependency or association relationships between services or roles in the cluster are as follows:

  • Service A depends on service B: If service A is deployed in the cluster, service B must be deployed in advance, because service B provides basic capabilities for service A. In the multi-service scenario, if multiple instances of service B are deployed, you need to specify the service B instance on which service A depends (an illustrative dependency check is sketched after this list).
  • Service A is associated with service B: Service A exchanges data with service B while the services are running, but their deployment does not depend on each other. In the multi-service scenario, if multiple instances of service B are deployed, you need to specify the service B instance associated with service A.
  • Role A and role B are deployed on the same server: If role A is deployed in the cluster, role B must also be deployed, and role A and role B must be deployed on the same node.
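
The following Python sketch is purely illustrative; it is not a FusionInsight tool or interface. It shows how the "depends on" rule above can be checked mechanically when planning which services to deploy, using a few dependency entries copied from Table 3-6.

    # Illustrative planning check only; not part of FusionInsight.
    # The dependency map copies a few entries from the Dependency column of Table 3-6.
    DEPENDS_ON = {
        "KrbServer": ["LdapServer"],
        "HDFS": ["ZooKeeper"],
        "Yarn": ["HDFS", "ZooKeeper"],
        "MapReduce": ["Yarn", "HDFS", "ZooKeeper"],
        "Hive": ["DBService", "MapReduce", "HDFS", "Yarn", "ZooKeeper"],
        "HBase": ["HDFS", "ZooKeeper", "Yarn"],
    }

    def missing_dependencies(planned_services):
        """Return, for each planned service, the required services missing from the plan."""
        planned = set(planned_services)
        missing = {}
        for service in planned:
            absent = [dep for dep in DEPENDS_ON.get(service, []) if dep not in planned]
            if absent:
                missing[service] = absent
        return missing

    # Example: a plan that includes Hive but omits DBService.
    print(missing_dependencies(["ZooKeeper", "HDFS", "Yarn", "MapReduce", "Hive"]))
    # Expected output: {'Hive': ['DBService']}
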
NOTE:
  • Federation: Only one pair of NameNode and Zkfc can be installed during cluster installation. If multiple pairs of NameNodes and Zkfcs need to be deployed for HDFS federation, manually add them after the cluster is installed.
  • Multi-service: The multi-service feature allows multiple components of the same type to be installed in the same cluster to better resolve resource isolation or performance problems. Services that support this feature are listed in the following table.
  • Hybrid deployment: In the hybrid deployment of x86 and ARM servers, some services do not allow multiple instances of a role to be installed on nodes on different platforms. The following table describes the details.
Table 3-6 Memory requirements and deployment principles of service roles

For each service, the entry below lists the service-level dependency, whether the service supports multi-service and hybrid (x86/ARM) deployment, and, for each role, the role name, the minimum memory, and the role deployment principle.

OMSServer
  Dependency: -
  Multi-service support: No; Hybrid deployment support: No
  • OMSServer, minimum memory 10 GB: Deploy OMSServers on two management nodes in active/standby mode.

LdapServer
  Dependency: -
  Multi-service support: No; Hybrid deployment support: No
  • SlapdServer (LS), minimum memory 1 GB:
    • OLdap service of OMS: Deploy LdapServers on two management nodes in active/standby mode.
    • LdapServer service: Deploy LS on at least two control nodes, with a maximum of ten instances. LS instances are backup instances of the OLdap service.

KrbServer
  Dependency: KrbServer depends on LdapServer. The KerberosServer and KerberosAdmin roles are deployed on the same server.
  Multi-service support: No; Hybrid deployment support: No
  • KerberosServer (KS), minimum memory 3 MB: Deploy KerberosServers on the same two control nodes as KerberosAdmins for load sharing.
  • KerberosAdmin (KA), minimum memory 2 MB: Deploy KerberosAdmins on the same two control nodes as KerberosServers for load sharing.

ZooKeeper
  Dependency: -
  Multi-service support: Yes; Hybrid deployment support: Yes
  • quorumpeer (QP), minimum memory 1 GB: Deploy three QPs on control nodes in each cluster. If expansion is required, ensure that the number of nodes that contain QP is an odd number. A maximum of nine instances are supported.

HDFS
  Dependency: HDFS depends on ZooKeeper. The NameNode and Zkfc roles are deployed on the same server.
  Multi-service support: No; Hybrid deployment support: Yes
  • NameNode (NN), minimum memory 4 GB: Deploy NameNodes on two control nodes in active/standby mode.
  • ZooKeeper FailoverController (Zkfc), minimum memory 1 GB: Deploy Zkfcs on two control nodes in active/standby mode.
  • JournalNode (JN), minimum memory 4 GB: Deploy JournalNodes on at least three control nodes. Each node stores one copy of backup data. To keep three or more copies of backup data, deploy additional JournalNodes on control nodes or data nodes, and ensure that the total number is an odd number.
  • DataNode (DN), minimum memory 4 GB: Deploy at least three DataNodes. You are advised to deploy DataNodes on data nodes.
  • Router, minimum memory 4 GB: Deploy Routers on at least three control nodes, and only in Federation scenarios. For details, see the section "Federation Configuration" under Federation in the Service Operation Guide.
  • HttpFS, minimum memory 128 MB: Deploy at most ten HttpFS instances, on the same nodes as DataNodes.

Yarn
  Dependency: Yarn depends on HDFS and ZooKeeper.
  Multi-service support: No; Hybrid deployment support: Yes
  • ResourceManager (RM), minimum memory 2 GB: Deploy ResourceManagers on two control nodes in active/standby mode.
  • NodeManager (NM), minimum memory 2 GB: Deploy NodeManagers on data nodes. The number of NodeManagers must be consistent with the number of HDFS DataNodes.

MapReduce
  Dependency: MapReduce depends on Yarn, HDFS, and ZooKeeper.
  Multi-service support: No; Hybrid deployment support: Yes
  • JobHistoryServer (JHS), minimum memory 2 GB:
    • In single-node deployment mode, deploy one JHS per cluster on a control node. This mode is recommended to ensure compatibility with open source.
    • In HA mode, deploy JHSs on two control nodes in active/standby mode.

DBService
  Dependency: -
  Multi-service support: Yes; Hybrid deployment support: No
  • DBServer, minimum memory 512 MB: Deploy DBServers on two control nodes in active/standby mode.

Hue
  Dependency: Hue depends on DBService.
    NOTE: In the Federation scenario, the Hue component can connect to HttpFS for interface conversion. In this case, Hue also depends on the HttpFS role of HDFS.
  Multi-service support: No; Hybrid deployment support: No
    NOTE: The multi-service feature is not supported, but multiple services, such as HBase and Hive, can be managed using Hue.
  • Hue, minimum memory 1 GB: Deploy Hues on two control nodes in active/standby mode.

Loader
  Dependency: Loader depends on MapReduce, Yarn, DBService, HDFS, and ZooKeeper.
  Multi-service support: Yes; Hybrid deployment support: No
  • LoaderServer (LS), minimum memory 2 GB: Deploy Loaders on two control nodes in active/standby mode.

Spark
  Dependency: Spark depends on Yarn, Hive, HDFS, MapReduce, ZooKeeper, and DBService.
  Multi-service support: Yes; Hybrid deployment support: Yes
  • SparkResource (SR): SR does not have an actual process and does not consume memory. Deploy SRs on at least two control nodes in non-active/standby mode.
  • JobHistory (JH), minimum memory 2 GB: Deploy JHs on two control nodes in non-active/standby mode.
  • JDBCServer (JS), minimum memory 2 GB: Deploy JSs on at least two control nodes. You can deploy JSs on multiple control nodes for load sharing.

Spark2x
  Dependency: Spark2x depends on Yarn, Hive, HDFS, MapReduce, ZooKeeper, and DBService.
  Multi-service support: Yes; Hybrid deployment support: Yes
  • SparkResource2 (SR2): SR2 does not have an actual process and does not consume memory. Deploy SR2s on at least two control nodes or data nodes in non-active/standby mode.
  • JobHistory2 (JH2), minimum memory 2 GB: Deploy JH2s on two control nodes in non-active/standby mode.
  • JDBCServer2 (JS2), minimum memory 2 GB: Deploy JS2s on at least two control nodes. You can deploy JS2s on multiple control nodes for load sharing.

Hive
  Dependency: Hive depends on DBService, MapReduce, HDFS, Yarn, and ZooKeeper.
  Multi-service support: Yes; Hybrid deployment support: Yes
  • HiveServer (HS), minimum memory 4 GB: Deploy HSs on at least two control nodes. You can deploy HSs on multiple control nodes for load sharing.
  • MetaStore (MS), minimum memory 2 GB: Deploy MSs on at least two control nodes. You can deploy MSs on multiple control nodes for load sharing.
  • WebHCat, minimum memory 2 GB: Deploy WebHCats on at least one control node. You can deploy WebHCats on multiple control nodes for load sharing.

HBase
  Dependency: HBase depends on HDFS, ZooKeeper, and Yarn.
  Multi-service support: Yes; Hybrid deployment support: Yes
  • HMaster (HM), minimum memory 1 GB: Deploy HMs on two control nodes in active/standby mode.
  • RegionServer (RS), minimum memory 6 GB: Deploy RSs on data nodes. The number of RSs must be consistent with the number of HDFS DataNodes.
  • ThriftServer (TS), minimum memory 1 GB: Deploy TSs on three control nodes in each cluster. If there is a long delay when a TS accesses HBase and the delay cannot meet user requirements, you can deploy additional TSs on control nodes or data nodes.

FTP-Server
  Dependency: FTP-Server depends on HDFS and ZooKeeper.
  Multi-service support: Yes; Hybrid deployment support: Yes
  • FTP-Server, minimum memory 1 GB: Each instance provides 16 concurrent channels by default. If more concurrent channels are required, deploy multiple instances. You are not advised to deploy FTP-Servers on control nodes or on nodes where DataNodes reside; when FTP-Servers are deployed on DataNode nodes, data may become unbalanced.

Flume
  Dependency: Flume depends on HDFS and ZooKeeper.
  Multi-service support: Yes; Hybrid deployment support: Yes
  • Flume, minimum memory 1 GB: You are advised to deploy Flume and DataNode on different nodes. If they are deployed on the same node, data imbalance may occur.
  • MonitorServer, minimum memory 128 MB: Deploy MonitorServers on two control nodes in non-active/standby mode.

Kafka
  Multi-service support: Yes; Hybrid deployment support: Yes
  • Broker, minimum memory 6 GB (Kafka depends on ZooKeeper): Deploy Brokers on at least two data nodes. If the data volume generated each day exceeds 2 TB, you are advised to deploy additional Brokers on control nodes.
  • KafkaUI, minimum memory - (KafkaUI depends on Broker and ZooKeeper): Deploy KafkaUIs on two control nodes.

Metadata
  Dependency: Metadata depends on DBService.
  Multi-service support: Yes; Hybrid deployment support: N/A
  • Metadata, minimum memory 512 MB: Deploy only one Metadata on one control node in each cluster.

Oozie
  Dependency: Oozie depends on DBService, Yarn, HDFS, and MapReduce.
  Multi-service support: Yes; Hybrid deployment support: No
  • oozie, minimum memory 1 GB: Deploy oozie instances on two control nodes in non-active/standby mode.

Solr
  Dependency: Solr depends on ZooKeeper.
    NOTE: When Solr data is stored on HDFS, Solr also depends on HDFS. If you choose to store Solr index data on HDFS, deploy only one SolrServer instance (including SolrServerAdmin) on each node.
  Multi-service support: Yes; Hybrid deployment support: No
  • SolrServerN (N = 1-5), minimum memory 2 GB: Each node supports five instances. You are advised to configure more than three nodes, with instances evenly distributed.
    • Preferentially store Solr data on HDFS and deploy three Solr instances on each node.
    • If the real-time index speed of a node is greater than 2 MB/s, you are advised to store data on local disks, deploy five Solr instances on each node, and mount a separate disk for each Solr instance.
    • Compared with storage on a local disk, storage on HDFS reduces performance by 30% to 50%.
  • SolrServerAdmin, minimum memory 2 GB: Deploy SolrServerAdmins on two data nodes in non-active/standby mode.
  • HBaseIndexer, minimum memory 512 MB (HBaseIndexer depends on HBase, HDFS, and ZooKeeper): Deploy one HBaseIndexer on each node where a SolrServer instance resides.
    NOTE: HBaseIndexer is required only when the hbase-indexer function is used.

Elasticsearch
  Dependency: Elasticsearch depends on ZooKeeper.
  Multi-service support: Yes; Hybrid deployment support: Yes
  • EsMaster, recommended memory 31 GB: Deploy EsMaster instances on control nodes. Install an odd number of instances (at least three).
  • EsNode1-9, recommended memory 31 GB (the -Xms and -Xmx values must be consistent): Deploy EsNode instances on data nodes. Install at least two EsNode1 instances; installation of EsNode2 to EsNode9 is optional.
  • EsClient, recommended memory 31 GB: You are advised to deploy EsClients on data nodes to share the load when the cluster scale is large or when there are a large number of service requests. Determine the number of EsClients to install as required.

SmallFS
  Dependency: SmallFS depends on MapReduce, Yarn, HDFS, and ZooKeeper.
  Multi-service support: No; Hybrid deployment support: No
  • FGCServer, minimum memory 6 GB: Deploy FGCServers on two control nodes in active/standby mode.

Flink
  Dependency: Flink depends on HDFS, Yarn, and ZooKeeper.
  Multi-service support: No; Hybrid deployment support: Yes
  • FlinkResource: FlinkResource does not have an actual process and does not consume memory. Deploy FlinkResource on data nodes. The number of FlinkResources must be consistent with the number of Yarn NodeManagers.

Storm
  Multi-service support: Yes; Hybrid deployment support: Yes
  • Logviewer, minimum memory 1 GB (dependency: -): Deploy a Logviewer on each node where a Supervisor is deployed.
  • Nimbus, minimum memory 1 GB (Nimbus depends on ZooKeeper): Deploy Nimbuses on two control nodes in active/standby mode. This service role is associated with UI.
  • UI, minimum memory 1 GB (UI depends on ZooKeeper): Deploy UIs on two control nodes. This service role is associated with Nimbus: on a node where Nimbus is deployed, UI is also deployed.
  • Supervisor, minimum memory 1 GB (dependency: -): Deploy Supervisors on at least one data node. If a large amount of computing capability is required, you are advised to deploy multiple Supervisors on independent nodes. Supervisors manage Workers, which occupy a large amount of resources. The number of Workers and their memory can be configured.
    NOTE: The number of Supervisors to be deployed can be calculated using the following formula, where the number of topologies and the number of Workers in each topology are planned by the customer and the number of Workers configured for each Supervisor is five by default:
    Number of Supervisors = Number of topologies x Number of Workers in each topology / Number of Workers configured for each Supervisor
    For example, 10 topologies with 3 Workers each and the default of 5 Workers per Supervisor require 10 x 3 / 5 = 6 Supervisors.

Redis
  Dependency: Redis depends on DBService.
  Multi-service support: No; Hybrid deployment support: No
  • Redis_1 to Redis_32, minimum memory 1 GB: In single-master mode, deploy Redis on at least one data node. Deploy Redis clusters on at least three data nodes if required.

GraphBase
  Dependency: GraphBase depends on HDFS, HBase, Spark, Yarn, ZooKeeper, MapReduce, Kafka, DBService, KrbServer, LdapServer, and Elasticsearch.
  Multi-service support: Yes; Hybrid deployment support: No
  • LoadBalancer, minimum memory 8 GB: Deploy LoadBalancers on two control nodes in active/standby mode.
  • GraphServer, minimum memory 32 GB: Deploy GraphServers based on the GraphServer service volume. You can deploy GraphServers on multiple nodes, or on nodes where NodeManager resides, for load sharing. To reduce GraphServer resource competition, you are advised to deploy it independently.

Elk
  Dependency: Elk depends on HDFS and Yarn.
  Multi-service support: No; Hybrid deployment support: No
  • ElkServer, minimum memory 16 GB: Deploy ElkServers on at least three data nodes where the DataNode of HDFS resides.
