FusionInsight HD 6.5.0 Administrator Guide 02

Backing Up HBase or OMS Data to a Third-Party Server


Scenario

FusionInsight HD supports backing up HBase data and OMS data (including OM, LDAP, and DBService data) in a cluster that employs the security mode to a third-party server outside the cluster. This improves system reliability.

Secure File Transfer Protocol (SFTP) is used to back up the data. Obtain one or more third-party servers running Linux (if multiple servers are configured, the backup data is distributed among them), install and configure them, and run the backup transmission script on one of them. The HBase or OMS data in the FusionInsight HD cluster is then backed up to the third-party servers. After a successful backup, the HBase data is stored on the third-party server (if multiple servers exist, part of the data is stored on each); the backup data storage path is set when the Loader job is configured. The OMS data is stored under the specified backup path; during restoration, copy it to the corresponding directory on the active management node.

Prerequisites

  • The Loader service has been installed in the cluster and is running properly.
  • You have contacted the system administrator and obtained the address for accessing the Loader WebUI, for example, https://10.10.0.252:20026/Loader/LoaderServer/175/loader/index.html#/home.
  • You have obtained user hbase and the password for accessing the Loader WebUI. Change the password upon the first login.
  • When a user creates an HBase table, KEEP_DELETED_CELLS is set to false by default. When such a table is backed up, deleted data is also backed up, and junk data may exist after restoration. If required by services, set this parameter to true manually when creating the table.
  • When a user manually specifies the timestamp when writing data to an HBase table and the specified time is earlier than the last backup time of the table, new data may not be backed up by incremental backup tasks.
NOTE:

The preceding prerequisites are not required if only OMS data needs to be backed up.

Procedure

Add the third-party server to the FusionInsight HD cluster.

  1. Log in to the FusionInsight Manager system.
  2. Click Host > Add, and add three third-party servers to the FusionInsight HD cluster as guided by the wizard.

    Install the FusionInsight HD client on third-party servers, and configure HBase security authentication.

  3. Install the FusionInsight HD client on the third-party server. For example, the client installation directory is /opt/hadoop-client. For detailed operations, see Installing a Client.
  4. To back up OMS data only, go to Step 18. To back up both OMS data and HBase data, go to Step 5.
  5. Add the following information to the <configuration></configuration> structure in client_installation_directory/HBase/hbase/conf/hbase-site.xml. Modify the parameters based on the comments.

    <!--Specifies the user name (that is, Kerberos account) for configuring the hbase.keytab file.--> 
    <property> 
    <name>username.client.kerberos.principal</name> 
    <value>hbase/hadoop.hadoop.com@HADOOP.COM</value> 
    </property>

    On the cluster node where RegionServer is installed, go to cluster_installation_directory/FusionInsight_HD_x.x.x/x_xx_RegionServer/etc/hbase-site.xml to check the value of hbase.regionserver.kerberos.principal and use that value as <value>.

    <!--Specifies the path where the hbase.keytab file is stored.--> 
    <property> 
    <name>username.client.keytab.file</name> 
    <value>hbase.keytab save path</value> 
    </property>

    On the cluster node where RegionServer is installed, go to cluster_installation_directory/FusionInsight_HD_x.x.x/x_xx_RegionServer/etc/ to obtain the keytab file of the specified user, and save the file to a path on the third-party server to use as <value>, for example, /opt/hadoop-config/hbase.keytab.

    <!--Specifies the script path of the file transfer tool Loader.--> 
    <property> 
    <name>hbase.backupandtransfer.etl.script</name> 
    <value>Client installation path/Loader/loader-tools-1.99.3/shell-client/submit_job.sh</value> 
    </property>

    After installing the client, decompress loader-tools-1.99.3.tar in the Loader folder and enter the path of submit_job.sh. The script is saved in the shell-client directory of the decompressed folder.

    <!--Specifies whether to compress the output result.--> 
    <property> 
    <name>mapred.output.compress</name> 
    <value>true</value> 
    </property> 
     
    <!--Specifies the compression coding mode.--> 
    <property> 
    <name>mapred.output.compression.codec</name> 
    <value>Encoding mode used by backup data</value> 
    </property>

    The following compression coding modes can be set. GzipCodec provides a high compression rate.

    • org.apache.hadoop.io.compress.SnappyCodec
    • org.apache.hadoop.io.compress.GzipCodec
    • org.apache.hadoop.io.compress.BZip2Codec
    <!--Specifies the save path of the backup data.-->
    <property> 
    <name>hbase.backupandtransfer.etl.output</name> 
    <value>Backup data save path</value> 
    </property>

    The file path on the third-party server to which the HBase data is backed up must be the same as the path configured in the Loader job. Create the backup path manually, set its owner to user omm, and grant read and write permissions to user omm.

    <!--Specifies the name of the Loader job for backing up HBase data.-->
    <property> 
    <name>hbase.backupandtransfer.etl.jobname</name> 
    <value>Loader job name</value> 
    </property>

    Name of the Loader job for backing up the HBase data.

    <!--Specifies the save path of the HBase jaas.conf configuration file.-->
    <property> 
    <name>java.security.auth.login.config</name> 
    <value>Save path of the HBase jaas.conf configuration file</value> 
    </property>

    On the cluster node where RegionServer is installed, go to cluster_installation_directory/FusionInsight_HD_x.x.x/x_xx_RegionServer/etc/ to obtain the jaas.conf file, and save it to a path on the third-party server to use as <value>. In jaas.conf, set keyTab to the save path of hbase.keytab and principal to hbase/hadoop.hadoop.com@HADOOP.COM.

    <!--Specifies the save path of the krb5.conf file.-->
    <property> 
    <name>java.security.krb5.conf</name> 
    <value>Save path of the krb5.conf file</value> 
    </property>

    On any cluster node, go to cluster_installation_directory/FusionInsight_BASE_x.x.x/x_xx_KerberosClient/etc/ to obtain the krb5.conf file, and save it to a path on the third-party server to use as <value>.

    NOTE:

    To back up data to a third-party server in the HBase multi-instance scenario, modify the configuration file in the HBase service instance directory on the third-party server. For example, for HBase2, modify client_installation_directory/HBase2/hbase/conf/hbase-site.xml.
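The property list above is easy to get wrong by hand. The following is a minimal shell sketch that greps an hbase-site.xml file for the required property names before a backup is attempted; the helper name, sample file, and property list are illustrative and not part of the product scripts.

```shell
# Hypothetical helper: report required property names missing from an
# hbase-site.xml file. The property list and sample file are illustrative.
check_hbase_site() {
  local conf_file="$1"; shift
  local missing=0
  for prop in "$@"; do
    if ! grep -q "<name>${prop}</name>" "$conf_file"; then
      echo "missing: ${prop}"
      missing=1
    fi
  done
  return $missing
}

# Example run against a sample fragment that lacks the Loader script path.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
  <property>
    <name>username.client.kerberos.principal</name>
    <value>hbase/hadoop.hadoop.com@HADOOP.COM</value>
  </property>
  <property>
    <name>username.client.keytab.file</name>
    <value>/opt/hadoop-config/hbase.keytab</value>
  </property>
</configuration>
EOF
check_hbase_site "$conf" \
  username.client.kerberos.principal \
  username.client.keytab.file \
  hbase.backupandtransfer.etl.script || true  # prints: missing: hbase.backupandtransfer.etl.script
rm -f "$conf"
```

Running the check against the real client file (one property name per argument) flags any block that was forgotten during Step 5.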

  6. Repeat Step 3 to Step 5 on the other two third-party servers.

    Create a data export job on the Loader UI.

  7. Enter the address for accessing the Loader WebUI and log in to the Loader WebUI as user hbase.

    For details about the address for Loader WebUI, contact the system administrator.

  8. Click New Job, and set the job property on the job configuration page, as shown in Figure 13-1.

    1. Set Name to the job name.
    2. Set Job type to Export.
    3. Click Add on the right of Connection to create a connection, as shown in Figure 13-2.
    4. Click OK. The new connection is created.
    5. Click Add next to Group, and create a Loader job group for saving the current Loader jobs. For example, in Name, enter hbase_loader, and click OK.
    6. Set Queue to default, the default Yarn queue.
    7. Set Priority to NORMAL.
    8. Click Next.

      The job property is set.

    Figure 13-1 Setting the job type

    Figure 13-2 Creating a connection between Loader and data source

  9. Set export information, and click Next, as shown in Figure 13-3.

    • Set Source type to HDFS.
    • Input directory: indicates the HDFS directory to be exported. The value is replaced when the backup job runs, so the directory set here does not take effect; you can enter any HDFS directory.
    • Set Path filter to * to match all paths.
    • Set File filter to * to match all files.
    • Set File type to the BINARY_FILE file input type.
    • Set File split type to the file split mode, including FILE and SIZE. FILE indicates splitting by file and SIZE indicates splitting by size. Files are split by the specified mode and become Map input files.
    • Set Extractor to the number of Map tasks used to execute the MapReduce job, for example, 10. Extractor and Extractor size cannot be set at the same time.
    Figure 13-3 Setting the export file

  10. On the 3.Transform page, configure data transformation and click Next.
  11. On the 4.To tab, set Output path, File operate type, Encode type and Compression, and click Save to save the job information, as shown in Figure 13-4.

    • Output path: indicates the path for exporting files to the SFTP server. The exported file is named by the export time and stored in /opt/huawei/sftp. Therefore, set Output path to a subdirectory of /opt/huawei/sftp, for example, /opt/huawei/sftp/file1. If HBase data is to be backed up to multiple servers, configure the output path for each server and separate the paths with semicolons (;).
    NOTE:

    Please select an empty directory for the output path.

    • File operate type: specifies the action taken when duplicate file names exist during the export. Set this parameter to ERROR.
    • Encode type: specifies the encoding format of an exported file. This parameter does not need to be set when backing up HBase data to a third-party server.
    • Compression: specifies whether to compress data during transmission. Set this parameter to false.
    Figure 13-4 Saving job information
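As a sketch of the semicolon-separated Output path convention described above, the following hypothetical shell helper checks that every entry sits under the SFTP root (assumed here to be /opt/huawei/sftp, per the example); it is not part of Loader.

```shell
# Hypothetical helper: verify that each semicolon-separated Output path
# entry is a subdirectory of the SFTP root. The root /opt/huawei/sftp
# follows the example above.
check_output_paths() {
  local paths="$1" root="${2:-/opt/huawei/sftp}"
  local IFS=';'
  for p in $paths; do
    case "$p" in
      "$root"/*) ;;                       # entry is below the SFTP root: ok
      *) echo "bad path: $p"; return 1 ;;
    esac
  done
  echo "all output paths ok"
}

check_output_paths '/opt/huawei/sftp/file1;/opt/huawei/sftp/file2'
# prints: all output paths ok
```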

  12. Click Save to save the job information.
  13. On the Loader home page, view the job and record the job ID.

Configure the Loader client.

  14. Use PuTTY to log in to the third-party server as the client installation user. Switch to the FusionInsight HD client installation directory, and run the following command to go to the configuration file directory of the Loader client:

    cd /opt/hadoop-client/Loader/loader-tools-1.99.3/loader-tool/job-config

  15. Run the following command to modify the configuration file:

    vi login-info.xml

    Modify the following parameters:

    <hadoop.config.path>/opt/hadoop-client/Loader/loader-tools-1.99.3/loader-tool/hadoop-config/</hadoop.config.path> 
    <authentication.principal>hbase/hadoop.hadoop.com</authentication.principal> 
    <authentication.keytab>/opt/hadoop-config/hbase.keytab</authentication.keytab> 
    <zookeeper.quorum>127.0.0.1:24002,127.0.0.2:24002,127.0.0.3:24002</zookeeper.quorum> 
    <sqoop.server.list>127.0.0.4:21351</sqoop.server.list>

    Parameters are described as follows:

    • hadoop.config.path: specifies the directory for storing the core-site.xml, hdfs-site.xml, and krb5.conf files that are required to access the Loader service. These files are stored in the Loader/loader-tools-1.99.3/loader-tool/hadoop-config directory of the client.
    • authentication.keytab: the path of HBase keytab file. For example, /opt/hadoop-config/hbase.keytab.
    • authentication.principal: indicates the user principal for accessing the Loader service. Set the parameter to hbase/hadoop.hadoop.com.
    • zookeeper.quorum: indicates the IP address and port of the ZooKeeper service. If there are multiple addresses and ports, use commas (,) to separate them.
    • sqoop.server.list: indicates the floating IP address and port of the Loader service.
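To double-check the values entered in login-info.xml, a small sed-based extractor can be used. This is an illustrative sketch; the helper name and sample file are assumptions, not part of the Loader tools.

```shell
# Hypothetical helper: print the text between <tag> and </tag> so the
# values written to login-info.xml can be eyeballed before use.
get_xml_value() {
  local tag="$1" file="$2"
  sed -n "s|.*<${tag}>\(.*\)</${tag}>.*|\1|p" "$file"
}

# Sample fragment standing in for the real login-info.xml.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
<authentication.principal>hbase/hadoop.hadoop.com</authentication.principal>
<sqoop.server.list>127.0.0.4:21351</sqoop.server.list>
EOF
echo "principal: $(get_xml_value authentication.principal "$cfg")"
echo "loader:    $(get_xml_value sqoop.server.list "$cfg")"
rm -f "$cfg"
```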

  16. Run the following command to modify the backup configuration file:

    vi /opt/hadoop-client/Loader/loader-tools-1.99.3/loader-backup/conf/backup.properties

    Modify the following parameters:

    server.url = 127.0.0.1:21351,127.0.0.2:21351 
    authentication.type = kerberos 
    authentication.user =  
    authentication.password=  
    job.jobId = 1 
    use.keytab = true 
    client.principal = hbase/hadoop.hadoop.com 
    client.keytab = /opt/hadoop-config/hbase.keytab

    The parameters are described as follows:

    • server.url: indicates the floating IP address and port of the Loader service. Replace the first IP address and port with those of the Loader service; leave the second unchanged.
    • authentication.type: indicates the user authentication mode for accessing Loader. The default value is kerberos. The password authentication mode is not recommended.
    • authentication.user: indicates the username in password authentication mode. This parameter does not need to be configured in this task.
    • authentication.password: indicates the user password in password authentication mode. This parameter does not need to be configured in this task.
    • job.jobId: identifies a created Loader job.
    • use.keytab: If this parameter is set to true, a user keytab file is used for security authentication.
    • client.principal: indicates the principal of the user for accessing the Loader service. Set the parameter to hbase/hadoop.hadoop.com.
    • client.keytab: the path of HBase keytab file. For example, /opt/hadoop-config/hbase.keytab.
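Before running a backup, the backup.properties values can be sanity-checked. The sketch below validates that job.jobId is numeric and that client.keytab is set when use.keytab is true; the helper names are hypothetical and not part of the product.

```shell
# Hypothetical pre-flight check for backup.properties. Key names match
# the file above; the helpers are not part of the Loader tools.
prop() {
  # prop <key> <file>: print the value of key, trimming surrounding spaces.
  sed -n "s|^[[:space:]]*$1[[:space:]]*=[[:space:]]*||p" "$2" | sed 's/[[:space:]]*$//'
}

validate_backup_conf() {
  local f="$1"
  case "$(prop job.jobId "$f")" in
    ''|*[!0-9]*) echo "job.jobId must be numeric"; return 1 ;;
  esac
  if [ "$(prop use.keytab "$f")" = "true" ] && [ -z "$(prop client.keytab "$f")" ]; then
    echo "client.keytab is required when use.keytab=true"; return 1
  fi
  echo "backup.properties looks consistent"
}
```

For example, `validate_backup_conf /opt/hadoop-client/Loader/loader-tools-1.99.3/loader-backup/conf/backup.properties` reports the first inconsistency it finds.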

  17. If HBase data is to be backed up to multiple servers, repeat Step 14 to Step 16 on each third-party server.

Back up data to the third-party server.

  18. Use PuTTY to log in to one third-party server as user root.

    NOTE:

    To back up data to one or multiple servers, you only need to perform the operations on one third-party server. After the script is started, the system automatically transmits data to each server based on the configurations.

  19. Run the following command to go to the directory where the backup transmission scripts are stored:

    cd ${BIGDATA_HOME}/om-agent/nodeagent/tools

  20. Run the following command to invoke the backup transmission script:

    ./backupAndTransfer.sh <FusionInsight client installation directory> <backup mode> <target backup directory> <backup data type> <HBase service instance ID>

    • Two HBase backup modes are supported:
      • full: full backup
      • inc: incremental backup
    • Two backup data types are supported. If this parameter is left empty, both HBase and OMS data are backed up.
      • hbase: backs up HBase data.
      • om: backs up OMS data.
    • If the multi-instance function is enabled and the data of other HBase service instances needs to be backed up to a third-party server, add the HBase service instance ID parameter. If this parameter is not added, only the data of the HBase instance is backed up by default.

    For example, back up HBase data to the third-party server in full backup mode:

    ./backupAndTransfer.sh /opt/hadoop-client full /opt/nbu/backup/result hbase

    For example, back up OMS data to the third-party server in full backup mode:

    ./backupAndTransfer.sh /opt/hadoop-client full /opt/nbu/backup/result om

    NOTE:
    • OMS backup files consist of OMS, LDAP, DBService, and HDFS-hacluster-fsimage backup files.
    • User omm must have write permission on the target directory of the backup data.

    For example, back up HBase2 data to the third-party server in full backup mode:

    ./backupAndTransfer.sh /opt/hadoop-client full /opt/nbu/backup/result hbase 2

    For example, back up HBase data (in full backup mode) and OMS data to the third-party server:

    ./backupAndTransfer.sh /opt/hadoop-client full /opt/nbu/backup/result

    For example, back up HBase2 data (in full backup mode) and OMS data to the third-party server:

    ./backupAndTransfer.sh /opt/hadoop-client full /opt/nbu/backup/result 2

    The following information is displayed if the command is run successfully:

    Data backup succeeded.
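The argument order of backupAndTransfer.sh shown above can be wrapped in a small validator. The sketch below is a hypothetical dry-run wrapper that checks the backup mode and data type and prints the command it would run; it is not part of the product scripts, and it does not cover the form where an instance ID is passed without a data type.

```shell
# Hypothetical dry-run wrapper around backupAndTransfer.sh: validates the
# backup mode and data type, then prints the command it would execute.
# Argument order follows the examples above.
run_backup() {
  local client_dir="$1" mode="$2" target="$3" data_type="${4:-}" instance_id="${5:-}"
  case "$mode" in
    full|inc) ;;
    *) echo "mode must be 'full' or 'inc'" >&2; return 1 ;;
  esac
  case "$data_type" in
    ''|hbase|om) ;;
    *) echo "data type must be 'hbase', 'om', or empty" >&2; return 1 ;;
  esac
  # Dry run: print instead of executing. Drop the echo to actually invoke it.
  echo ./backupAndTransfer.sh "$client_dir" "$mode" "$target" $data_type $instance_id
}

run_backup /opt/hadoop-client full /opt/nbu/backup/result hbase
# prints: ./backupAndTransfer.sh /opt/hadoop-client full /opt/nbu/backup/result hbase
```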

Updated: 2019-05-17

Document ID: EDOC1100074522
