FusionInsight HD V100R002C60SPC200 Product Description 06

Oozie


Basic Concept

Function

Oozie is an open-source workflow engine that is used to schedule and coordinate Hadoop tasks.

Architecture

The Oozie engine is a web application that runs in a Tomcat container by default. Oozie stores its data in a PostgreSQL database.

Oozie provides a web console based on Ext JS. The web console is read-only: users can view and monitor Oozie workflows on it, but cannot control them. To control workflows, the Oozie client calls the external Representational State Transfer (REST) web service interface that Oozie provides. Through this interface, the client can start and stop workflows and orchestrate and run Hadoop MapReduce tasks, as shown in Figure 2-69.

Figure 2-69 Oozie architecture
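
As an illustration of how a client might talk to the REST interface, the following Python sketch builds requests for querying and controlling a workflow job. It assumes the URL layout of the open-source Oozie web services API (/v1/job/<job-id> with the show and action parameters). The host name, the port, and the helper function names are placeholders invented for this example, and a security-enabled cluster would also require authentication, which is not shown.

```python
import json
import urllib.request

# Assumption: placeholder host and port. Check the actual Oozie URL of your cluster.
OOZIE_URL = "http://oozie-host.example.com:11000/oozie"

# Job actions this sketch allows (a subset of the open-source Oozie job actions).
ALLOWED_ACTIONS = {"start", "suspend", "resume", "kill"}


def job_info_request(job_id: str) -> urllib.request.Request:
    """Build a GET request that asks the Oozie server for information about a job."""
    url = f"{OOZIE_URL}/v1/job/{job_id}?show=info"
    return urllib.request.Request(url, method="GET")


def job_action_request(job_id: str, action: str) -> urllib.request.Request:
    """Build a PUT request that changes the state of a job (for example, kill)."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    url = f"{OOZIE_URL}/v1/job/{job_id}?action={action}"
    return urllib.request.Request(url, method="PUT")


def fetch_status(job_id: str) -> str:
    """Send the info request and return the job's 'status' field.

    This function needs a reachable Oozie server; the two builders above do not.
    """
    with urllib.request.urlopen(job_info_request(job_id)) as resp:
        return json.load(resp)["status"]
```

Separating request construction from sending keeps the sketch easy to inspect without a running server. Only fetch_status opens a network connection.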

Table 2-19 describes the functions of each module shown in Figure 2-69.

Table 2-19 Oozie modules

Name                 Description
Console              Enables users to view and monitor Oozie workflows.
Client               Provides interfaces for users to control workflows. On a client, a user can submit, start, run, terminate, and restore workflows.
SDK                  Software development kit: a set of development tools that software engineers use to build applications for particular software packages, software frameworks, hardware platforms, and operating systems.
Database             PostgreSQL (PG) database.
WebApp (Oozie)       Functions as the Oozie server. It can be deployed in a built-in or an external Tomcat container. Information recorded by WebApp (Oozie), including logs, is stored in the PG database.
Tomcat               Free, open-source web application server.
Hadoop components    Underlying components that execute the workflows orchestrated by Oozie. Hadoop components include MapReduce and Hive.
Principle

Oozie is a workflow engine server that runs MapReduce workflows. Oozie is also a Java web application that runs in a Tomcat container.

Oozie workflows are defined in Hadoop Process Definition Language (hPDL), an XML-based language similar to JBoss jBPM Process Definition Language (jPDL). An Oozie workflow consists of Control Nodes and Action Nodes.

  • Control Nodes control the orchestration of a workflow. They include start, end, kill (error), decision, fork, and join.
  • An Oozie workflow contains multiple Action Nodes, such as MapReduce and Java.

    All Action Nodes are deployed and run as a directed acyclic graph (DAG), so they run in a defined order: the next Action Node can run only after the previous Action Node has finished. When the current Action Node finishes, the remote server calls back the Oozie interface, and Oozie then executes the next Action Node. This continues until all Action Nodes have been executed (an Action Node that fails also counts as executed).
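
The ordering rule above can be sketched in a few lines of Python. This is a simplified model, not Oozie's implementation: the function run_workflow and the action names are hypothetical, each action is a local callable standing in for a remote job plus its callback, and a failed action is simply recorded before the run continues.

```python
from collections import deque


def run_workflow(actions, depends_on):
    """Run actions in dependency order and return a list of (name, succeeded) pairs.

    actions: dict mapping an action name to a zero-argument callable returning True or False.
    depends_on: dict mapping an action name to the names that must finish before it.
    """
    # For every action, track the predecessors that have not finished yet.
    remaining = {name: set(depends_on.get(name, [])) for name in actions}
    # Actions without predecessors can run immediately.
    ready = deque(sorted(name for name, deps in remaining.items() if not deps))
    finished = []
    while ready:
        name = ready.popleft()
        # Calling the action stands in for the remote job and its callback to Oozie.
        succeeded = bool(actions[name]())
        finished.append((name, succeeded))
        del remaining[name]
        # Release every action whose last unfinished predecessor was this one.
        for other in sorted(remaining):
            deps = remaining[other]
            if name in deps:
                deps.discard(name)
                if not deps:
                    ready.append(other)
    if remaining:
        raise ValueError("cycle or unknown dependency: the actions do not form a DAG")
    return finished


# Hypothetical three-step workflow: import -> clean -> report.
actions = {
    "import": lambda: True,
    "clean": lambda: True,
    "report": lambda: False,  # a failed action is still recorded as executed
}
depends_on = {"clean": ["import"], "report": ["clean"]}
result = run_workflow(actions, depends_on)
```

In this model an action is released only after every one of its predecessors has finished, which is the property the DAG guarantees. A real workflow would instead follow the transition defined for a failed action.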

Oozie provides various types of Action Nodes, such as MapReduce, Hadoop Distributed File System (HDFS), Secure Shell (SSH), Java, and Oozie sub-workflows, to support different business requirements.
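
To make the node types concrete, the following Python sketch generates a minimal workflow definition with one HDFS (fs) Action Node surrounded by start, kill, and end Control Nodes. It assumes the element names of the open-source Oozie workflow schema and the namespace version uri:oozie:workflow:0.5, which may differ from the version shipped with this product. The function build_workflow, the node names, and the ${nameNode} property are hypothetical choices made for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: schema namespace of open-source Oozie 4.x. Verify it for your version.
WORKFLOW_NS = "uri:oozie:workflow:0.5"


def build_workflow(name: str, hdfs_dir: str) -> str:
    """Return workflow XML with a single fs action that creates a directory."""
    wf = ET.Element("workflow-app", {"name": name, "xmlns": WORKFLOW_NS})

    # Control Node: entry point of the workflow.
    ET.SubElement(wf, "start", {"to": "prepare-dir"})

    # Action Node: HDFS action that creates one directory.
    action = ET.SubElement(wf, "action", {"name": "prepare-dir"})
    fs = ET.SubElement(action, "fs")
    ET.SubElement(fs, "mkdir", {"path": hdfs_dir})
    ET.SubElement(action, "ok", {"to": "end"})      # transition taken on success
    ET.SubElement(action, "error", {"to": "fail"})  # transition taken on failure

    # Control Node: stops the workflow with an error message.
    kill = ET.SubElement(wf, "kill", {"name": "fail"})
    ET.SubElement(kill, "message").text = "The prepare-dir action failed."

    # Control Node: normal end of the workflow.
    ET.SubElement(wf, "end", {"name": "end"})

    return ET.tostring(wf, encoding="unicode")


# ${nameNode} is a hypothetical job property that would be resolved at submission time.
xml_text = build_workflow("demo-workflow", "${nameNode}/tmp/demo-output")
```

Every Action Node declares both an ok and an error transition, which is how the workflow decides where to go after the action's callback arrives.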

Updated: 2019-04-10

Document ID: EDOC1000104139
