FusionInsight HD 6.5.0 Product Description 02

Logical Architecture

Figure 2-1 shows the logical architecture of FusionInsight HD.

Figure 2-1 FusionInsight HD logical architecture

FusionInsight HD packages and enhances open-source components. Manager and the other components provide the following functions:

  • DBService

    DBService is a highly reliable traditional relational database that provides the metadata storage service for the Hive, Hue, and Spark components.

  • Elasticsearch

    Elasticsearch is a distributed open-source system written in Java and built on Lucene. It combines search engine and NoSQL database functions and supports RESTful requests.

  • Flink

    Flink is a unified computing framework that supports both batch processing and stream processing. It provides a stream data processing engine that supports data distribution and parallel computing.

  • Flume

    Flume is a distributed, highly reliable, and highly available system for aggregating massive volumes of logs. It supports customized data senders for collecting data, performs simple processing on the collected data, and writes the data to customizable data receivers.

  • FTP-Server

    FTP-Server enables users to perform basic operations on HDFS from an FTP client using the file transfer protocol. These operations include uploading and downloading files; querying, creating, and deleting directories; and modifying file access permissions.

  • GraphBase

    GraphBase is a distributed graph database based on HBase and Elasticsearch. It builds a property graph model for storage and provides powerful graph query, analysis, and traversal capabilities.

  • HBase

    HBase is a distributed, column-oriented storage system built on the Hadoop Distributed File System (HDFS). It is designed to store massive amounts of data.

  • HDFS

    HDFS provides high-throughput data access and is suited to processing large data sets.

  • Hive

    Hive is an open-source data warehouse built on Hadoop. It stores structured data and provides basic data analysis services using the Hive query language (HQL), an SQL-like language.

  • Hue

    Hue provides a graphical web user interface (WebUI) for FusionInsight HD applications. Hue supports components including HDFS, Hive, Yarn/MapReduce, Oozie, Solr, and ZooKeeper.

  • Kafka

    Kafka is a distributed, partitioned, and replicated real-time message publishing and subscription system. It provides scalable, high-throughput, low-latency, and reliable message dispatching services.

  • Loader

    Loader is a data exchange tool based on the open-source Sqoop, with function enhancements. It exchanges data and files among FusionInsight HD, relational databases, and HDFS. It also provides a Representational State Transfer (REST) application programming interface (API) for third-party scheduling platforms.

  • Manager

    As the operation and maintenance (O&M) system of FusionInsight HD, Manager provides cluster management capabilities that feature reliability, security, fault tolerance, and ease of use. It supports large-scale cluster installation, monitoring, alarm reporting, user management, rights management, auditing, service management, health check, troubleshooting, upgrading, and patch installation.

  • MapReduce

    MapReduce is a distributed data processing model and execution environment. It accelerates the parallel processing of massive data.

  • Metadata

    The Metadata component extracts the metadata of the data warehouse components Hive and HBase. It allows labels to be set manually on each metadata item to support data analysis and search.

  • Oozie

    Oozie orchestrates and executes workflows of open-source Hadoop components. Oozie runs in a Java Servlet container (such as Tomcat) as a Java web application. Oozie stores workflow definitions and the running workflow instances (including instance status and variables) in databases.

  • Redis

    Redis is an open-source, high-performance, distributed key-value storage database. It supports a variety of data types, complements key-value stores such as memcached, and meets real-time and high-concurrency requirements.

  • SmallFS

    SmallFS merges small files in the background. It automatically detects small files in the system based on a file size threshold, merges them when the system is idle, and stores their metadata in a local LevelDB to reduce the NameNode load. It also provides a new FileSystem interface so that users can access the merged small files transparently.

  • Solr

    Solr is a high-performance full-text search server based on Lucene. Solr extends Lucene with a richer query language and optimizes search performance. In addition, Solr is configurable and extensible and provides a complete user interface (UI) for function management.

  • Spark

    Spark is a distributed computing architecture that implements memory-based computing.

    To ensure that later versions can be upgraded properly, you are advised to migrate service applications from the Spark component to Spark2x.

  • Spark2x

    Spark is a distributed computing architecture that implements memory-based computing.

  • Storm

    Storm provides a distributed, high-performance, reliable, and fault-tolerant computing platform for real-time processing of massive data. It uses Continuous Query Language (CQL), an SQL-like stream processing language, to implement rapid development and rollout of services.

  • Yarn

    Yarn is the resource management system. As a general-purpose resource module, it manages and schedules resources for various kinds of applications.

  • ZooKeeper

    ZooKeeper enables highly reliable distributed coordination. It helps prevent single points of failure (SPOFs) and provides reliable services for applications.
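
The MapReduce entry in the list above describes a distributed data processing model. As a rough illustration of that model only, the following single-process Python sketch walks through the map, shuffle, and reduce phases for a word count. It does not use the Hadoop or FusionInsight HD APIs; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence inside one process."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Illustrative input only; a real job would read its input from HDFS.
    sample = ["HDFS stores data", "Yarn schedules jobs", "HDFS stores blocks"]
    print(word_count(sample))
```

In a real cluster, the map and reduce functions run in parallel on many nodes and the framework performs the shuffle over the network. The sketch keeps everything in memory to show only how data flows between the phases.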

Updated: 2019-05-17

Document ID: EDOC1100074548
