FusionInsight HD V100R002C60SPC200 Product Description 06

Logical Architecture

Figure 2-1 shows the logical architecture of FusionInsight HD.

Figure 2-1 FusionInsight HD logical architecture

FusionInsight HD packages and enhances open-source components. Manager and the individual components provide the following functions:

  • Manager

    As the operation and maintenance (O&M) system, Manager provides reliable, secure, fault-tolerant, and easy-to-use cluster management for FusionInsight HD. It supports large-scale cluster installation, monitoring, alarm reporting, user management, rights management, auditing, service management, health checks, troubleshooting, upgrading, and patch installation.

  • Hue

    Hue provides a graphical web user interface (WebUI) for FusionInsight HD applications. Hue supports components including the Hadoop Distributed File System (HDFS), Yarn, and Hive.

  • Loader

    Loader is a tool used to exchange data and files between FusionInsight HD, relational databases, and the HDFS. It provides a Representational State Transfer (REST) application programming interface (API) for third-party scheduling platforms.

  • Flume

    Flume is a distributed, highly reliable, and highly available (HA) system for aggregating massive volumes of log data. Flume supports customized data transmitters for collecting data, performs simple processing on the data, and writes it to customizable data receivers.

  • FTP-Server

    FTP-Server enables users to perform basic operations on HDFS through an FTP client using the file transfer protocol. These operations include uploading and downloading files; querying, creating, and deleting directories; and modifying file access permissions.

  • Hive

    Hive is an open-source data warehouse built on Hadoop. It stores structured data and provides basic data analysis services using the Hive query language (HQL), an SQL-like language.

  • MapReduce

    MapReduce is a distributed data processing model and execution environment. It speeds up the processing of massive data by running work in parallel.

  • Streaming

    Streaming provides a distributed, high-performance, reliable, and fault-tolerant computing platform for real-time processing of massive data. It uses Continuous Query Language (CQL), an SQL-like stream processing language, to implement rapid development and rollout of services.

  • Spark

    Spark is a distributed computing architecture that implements memory-based computing.

  • Solr

    Solr is a high-performance full-text search server based on Lucene. Building on Lucene, Solr provides richer query capabilities and optimizes search performance. In addition, Solr is configurable and extensible and provides a complete management user interface (UI).

  • Oozie

    Oozie orchestrates and executes workflows of open-source Hadoop components. Oozie runs as a Java web application in a Java Servlet container (such as Tomcat). Oozie stores workflow definitions and running workflow instances (including instance status and variables) in a database.

  • Redis

    Redis is an open-source, high-performance, distributed key-value storage database. It supports a variety of data types, complementing key-value stores such as memcached, and meets real-time and high-concurrency requirements.

  • Kafka

    Kafka is a distributed, partitioned, and replicated real-time message publishing and subscription system. It provides scalable, high-throughput, low-latency, and reliable message dispatching services.

  • Yarn

    Yarn is the resource management system. As a general-purpose resource module, it manages and schedules resources for various kinds of applications.

  • HDFS

    The HDFS enables high-throughput data access and applies to processing of large data sets.

  • SmallFS

    SmallFS provides a background function for merging small files. SmallFS automatically detects small files in the system based on a file size threshold, merges them when the system is idle, and stores their metadata in a local LevelDB to reduce the NameNode load. It also provides a new FileSystem interface so that users can access these small files transparently.

  • DBService

    DBService is a highly reliable traditional relational database that provides metadata storage for the Hive, Hue, and Spark components.

  • HBase

    HBase is a distributed, column-oriented storage system built on the Hadoop Distributed File System (HDFS). It stores massive data.

  • ZooKeeper

    ZooKeeper enables highly reliable distributed coordination. It helps prevent single points of failure (SPOFs) and provides reliable services for applications.
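
The small-file merging that the SmallFS entry describes can be illustrated with a toy model. The following Python sketch is not the SmallFS interface: the class name SmallFileMerger, its methods, and the in-memory dictionaries are all hypothetical stand-ins (a dictionary plays the role of the LevelDB metadata store, and a byte buffer plays the role of the merged file). It only shows the general idea of detecting files below a size threshold, merging them, and keeping an index so that reads stay transparent.

```python
from dataclasses import dataclass, field


@dataclass
class SmallFileMerger:
    """Toy model of threshold-based small-file merging (hypothetical, not the SmallFS API)."""
    threshold: int                                            # files smaller than this many bytes are "small"
    files: dict = field(default_factory=dict)                 # path -> bytes; stands in for unmerged files
    container: bytearray = field(default_factory=bytearray)   # stands in for the merged file
    index: dict = field(default_factory=dict)                 # path -> (offset, length); stands in for LevelDB

    def write(self, path, data):
        """Store a file as-is; merging happens later in the background."""
        self.files[path] = data

    def merge_when_idle(self):
        """Move every file below the threshold into the container and return the merged paths."""
        small = sorted(p for p, d in self.files.items() if len(d) < self.threshold)
        for path in small:
            data = self.files.pop(path)
            self.index[path] = (len(self.container), len(data))
            self.container.extend(data)
        return small

    def read(self, path):
        """Return the file contents whether or not the file has been merged."""
        if path in self.index:
            offset, length = self.index[path]
            return bytes(self.container[offset:offset + length])
        return self.files[path]


# Usage: two small files are merged, one large file is left alone.
fs = SmallFileMerger(threshold=16)
fs.write("/logs/a.txt", b"tiny")
fs.write("/logs/b.txt", b"also tiny")
fs.write("/data/big.bin", b"x" * 64)
merged = fs.merge_when_idle()
```

In this sketch the caller reads a merged file by its original path, which mirrors the transparent access described above. Only the large file remains as a separate entry, which is the effect that reduces per-file metadata load.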

Updated: 2019-04-10

Document ID: EDOC1000104139
