
FusionInsight HD 6.5.0 Product Description 02

Loader

Basic Principle

Function

Loader is based on the open-source Sqoop 1.99.x with function enhancements. It exchanges data and files between FusionInsight, relational databases, and file servers: Loader can import data from relational databases or file servers to the Hadoop Distributed File System (HDFS) or HBase of FusionInsight, and export data from HDFS or HBase to relational databases or file servers.
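Conceptually, an import job automates the kind of copy shown in the Java sketch below: reading rows from a relational database over JDBC and writing them to a file in HDFS. This is an illustration only, not Loader code; the JDBC URL, credentials, table, and HDFS path are hypothetical.

    import java.io.PrintWriter;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ImportSketch {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            // Hypothetical source database and target HDFS path.
            try (Connection db = DriverManager.getConnection(
                         "jdbc:mysql://db-host:3306/sales", "loader", "secret");
                 Statement stmt = db.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT id, amount FROM orders");
                 PrintWriter out = new PrintWriter(
                         fs.create(new Path("/user/loader/orders.csv")))) {
                // Copy each row of the relational table into a CSV file on HDFS.
                while (rs.next()) {
                    out.println(rs.getLong("id") + "," + rs.getBigDecimal("amount"));
                }
            }
        }
    }

Loader performs the same movement in parallel MapReduce tasks rather than in a single client process, as described under Principle below.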

Structure

The Loader architecture consists of the Loader client and the Loader server, as shown in Figure 2-58.

Figure 2-58 Loader Architecture

Table 2-15 describes the functions of each component shown in Figure 2-58.

Table 2-15 Function description

Loader Client: Provides a web user interface (WebUI) and a command-line interface (CLI).

Loader Server: Processes operation requests sent from the client, manages connectors and metadata, submits MapReduce jobs, and monitors MapReduce job status.

REST API: Provides a Representational State Transfer (RESTful) interface (HTTP + JSON) to process operation requests from the client.

Job Scheduler: Periodically executes Loader jobs.

Transform Engine: A data transformation engine that supports field combination, string splitting, and string reversal.

Execution Engine: Executes Loader jobs as MapReduce jobs.

Submission Engine: Submits Loader jobs to MapReduce.

Job Manager: Manages Loader jobs, including creating, querying, updating, deleting, activating, deactivating, starting, and stopping jobs.

Metadata Repository: Stores and manages data about Loader connectors, transformation procedures, and jobs.

HA Manager: Manages the active/standby status of the Loader servers. Two Loader servers are deployed in active/standby mode.
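As an illustration of the HTTP + JSON style used by the REST API component, the Java sketch below sends a JSON request to a Loader server. The host, port, URL path, and payload fields are hypothetical and are not taken from the Loader interface reference; consult that reference for the real endpoints.

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class RestSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical endpoint for starting a job over HTTP + JSON.
            URL url = new URL("https://loader-server:21351/loader/v1/job/start");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setDoOutput(true);
            byte[] body = "{\"jobName\":\"db-to-hdfs\"}".getBytes(StandardCharsets.UTF_8);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(body);
            }
            System.out.println("HTTP " + conn.getResponseCode());
        }
    }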

Principle

Implementing parallel job execution and fault tolerance using MapReduce

Loader runs import and export jobs as parallel MapReduce jobs. Some jobs involve only a Map phase; others involve both Map and Reduce phases.

Loader also relies on MapReduce for fault tolerance: if job execution fails, the job can be rescheduled and rerun.
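As a sketch of what executing a job "in MapReduce manner" looks like, the skeleton below configures a map-only Hadoop job and raises the per-task retry limit, the MapReduce-level fault tolerance referred to above. The job name and retry count are hypothetical, and the mapper, input format, and output format a real job requires are omitted.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class MapOnlyJobSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // A failed map task is retried up to 4 times before the job fails.
            conf.setInt("mapreduce.map.maxattempts", 4);
            Job job = Job.getInstance(conf, "loader-import"); // hypothetical name
            job.setJarByClass(MapOnlyJobSketch.class);
            // An import that needs no aggregation can run map-only:
            // no shuffle and no Reduce phase.
            job.setNumReduceTasks(0);
            // A real job would also set the mapper, input format, and output
            // format before calling job.waitForCompletion(true).
            System.out.println("reduce tasks: " + job.getNumReduceTasks());
        }
    }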

Importing data to HBase

  1. In the Map phase of the MapReduce job, Loader obtains data from the external data source.
  2. In the Reduce phase, Loader starts as many Reduce tasks as there are Regions in the target table. The Reduce tasks receive data from the Map tasks, generate HFiles by Region, and store the HFiles in a temporary directory on HDFS.
  3. When the MapReduce job is committed, Loader migrates the HFiles from the temporary directory to the HBase directory (see the bulk-load sketch below).
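The final step corresponds to an HBase bulk load. The sketch below shows it using HBase's LoadIncrementalHFiles utility; whether Loader uses this exact class internally is an assumption, and the temporary directory and table name are hypothetical.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

    public class BulkLoadSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            TableName name = TableName.valueOf("loader_target"); // hypothetical table
            try (Connection conn = ConnectionFactory.createConnection(conf)) {
                // Move the HFiles written by the Reduce tasks from the temporary
                // HDFS directory into the Regions of the target table.
                new LoadIncrementalHFiles(conf).doBulkLoad(
                        new Path("/tmp/loader-hfiles"), // hypothetical temp directory
                        conn.getAdmin(),
                        conn.getTable(name),
                        conn.getRegionLocator(name));
            }
        }
    }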

Importing data to the HDFS

  1. In the Map phase of the MapReduce job, Loader obtains data from the external data source and writes it to a temporary directory on HDFS, named after the export directory with the suffix -ldtmp.
  2. When the MapReduce job is committed, Loader migrates the data from the temporary directory to the export directory (see the rename sketch below).
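The migration in step 2 can be a single HDFS rename, which is a cheap metadata operation. The directory names below are hypothetical; the -ldtmp suffix follows the naming convention described in step 1.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsCommitSketch {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path tmp = new Path("/user/loader/output-ldtmp"); // hypothetical temp directory
            Path out = new Path("/user/loader/output");       // hypothetical export directory
            // In HDFS a rename only updates NameNode metadata, so committing
            // the imported data is fast regardless of its size.
            if (!fs.rename(tmp, out)) {
                throw new IllegalStateException("failed to commit " + tmp + " to " + out);
            }
        }
    }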

Exporting data to a relational database

  1. In the Map phase of the MapReduce job, Loader obtains data from HDFS or HBase and inserts it into a temporary table (the staging table) over a Java Database Connectivity (JDBC) connection.
  2. When the MapReduce job is committed, Loader migrates the data from the staging table to the formal table (see the staging-table sketch below).
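The staging-table pattern keeps partially written data out of the formal table: the Map tasks insert only into the staging table, and the rows are moved once the whole job has succeeded. In the sketch below the JDBC URL, credentials, and table names are hypothetical.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class StagingCommitSketch {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://db-host:3306/sales", "loader", "secret")) {
                conn.setAutoCommit(false);
                try (Statement stmt = conn.createStatement()) {
                    // Move the rows the Map tasks wrote into the staging table
                    // over to the formal table in a single transaction.
                    stmt.executeUpdate("INSERT INTO orders SELECT * FROM orders_staging");
                    stmt.executeUpdate("DELETE FROM orders_staging");
                }
                conn.commit();
            }
        }
    }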

Exporting data to a file system

  1. In the Map phase of the MapReduce job, Loader obtains data from HDFS or HBase and writes it to a temporary directory on the file server.
  2. When the MapReduce job is committed, Loader migrates the data from the temporary directory to the formal directory (see the move sketch below).
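On an ordinary file system the same commit step is a directory move. The sketch below uses java.nio with hypothetical paths; an atomic move requires that both directories reside on the same file system.

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;

    public class FileCommitSketch {
        public static void main(String[] args) throws Exception {
            Path tmp = Paths.get("/data/export-tmp"); // hypothetical temp directory
            Path dst = Paths.get("/data/export");     // hypothetical formal directory
            // Readers never see a half-written directory: the data becomes
            // visible only when the move completes.
            Files.move(tmp, dst, StandardCopyOption.ATOMIC_MOVE);
        }
    }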

Relationship with Components

The components that interact with Loader include HDFS, HBase, MapReduce, and ZooKeeper. Loader acts as a client of these components, for example storing data in HDFS and HBase and reading data from HDFS and HBase tables. In addition, Loader functions as a MapReduce client to import or export data.
