
Replacing Namespace in HDFS

Publication Date:  2019-04-25
Issue Description

If an HDFS cluster has multiple namespaces, the namespaces may need to be adjusted, for example, replaced or merged. In that case, the data in the affected namespace must be copied and migrated, and the metadata must be updated accordingly.

Handling Process

1. Stop services.

2. Back up DBService data.

        Log in to the active DBService node as user omm.

        cd /opt/huawei/Bigdata/FusionInsight_Current/3_27_DBServer/install/sbin

        sh dbservice_backup.sh -b
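        Record the backup package path reported by the script; it is needed for the restore command in Step 13. For example (the path shown is only a placeholder):

        backup_path=/srv/BigData/dbservice_backup/dbservice_backup_20180905.tar.gz

        ls -l ${backup_path}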

3. Create snapshots. On the HDFS client, create a snapshot for each directory to be migrated as a user with the supergroup permission.

The directories to be snapshotted must be selected manually to determine which files in the original namespace need to be copied. (A verification example follows the directory list below.)

The following directories are used as examples:

  • Hive external table data for tests

        hdfs dfsadmin -allowSnapshot hdfs://NS1/federation_013

        hdfs dfs -createSnapshot hdfs://NS1/federation_013 20180905sp

  • Hive specific partition data for tests

         hdfs dfsadmin -allowSnapshot hdfs://NS1/hive_test

         hdfs dfs -createSnapshot hdfs://NS1/hive_test 20180905sp

  • Hive internal table data for tests

        hdfs dfsadmin -allowSnapshot hdfs://NS1/user

        hdfs dfs -createSnapshot hdfs://NS1/user 20180905sp
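To confirm that the snapshots were created, you can list the snapshottable directories and the contents of a .snapshot directory (an optional check, not part of the original procedure):

        hdfs lsSnapshottableDir

        hdfs dfs -ls hdfs://NS1/federation_013/.snapshot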

4. Use DistCp to copy the snapshot data to the target namespace.

(A single node migrates data at roughly 50 MB/s; the total migration time can be estimated from this rate and the data volume.)
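For example, check the size of each source directory and divide it by the aggregate throughput. The figures below are only an illustration: about 10 TB copied by 20 nodes at 50 MB/s each (roughly 1000 MB/s in total) takes about 3 hours.

    hdfs dfs -du -s -h hdfs://NS1/federation_013 hdfs://NS1/hive_test hdfs://NS1/user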

  1. Check whether the target directory exists (for example, hdfs://hacluster/federation_013).
    hdfs dfs -ls hdfs://hacluster/federation_013

  • If it does, run the update command:

         hadoop distcp -Dmapreduce.job.queuename=root.default -m 500 -update -delete -prbugpcaxt hdfs://NS1/federation_013/.snapshot/20180905sp hdfs://hacluster/federation_013

  • If it does not, run the following command:

        hadoop distcp -Dmapreduce.job.queuename=root.default -m 500 -prbugpcaxt hdfs://NS1/federation_013/.snapshot/20180905sp hdfs://hacluster/federation_013

        The tmp directory requires special handling: add the -i parameter to ignore errors during copying. The command can be run multiple times.

    hadoop distcp -Dmapreduce.job.queuename=root.default -m 500 -update -i -prbugpcaxt hdfs://NS1/tmp/.snapshot/20180905sp hdfs://hacluster/tmp

Example:

hadoop distcp -Dmapreduce.job.queuename=root.default -m 500 -prbugpcaxt hdfs://NS1/federation_013/.snapshot/20180905sp hdfs://hacluster/federation_013

hadoop distcp -Dmapreduce.job.queuename=root.default -m 500 -prbugpcaxt hdfs://NS1/hive_test/.snapshot/20180905sp hdfs://hacluster/hive_test

hadoop distcp -Dmapreduce.job.queuename=root.default -m 500 -prbugpcaxt hdfs://NS1/user/.snapshot/20180905sp hdfs://hacluster/user
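After each copy, you can compare the size of the source snapshot with the size of the target directory as a sanity check (an optional check, not part of the original procedure):

hdfs dfs -du -s hdfs://NS1/federation_013/.snapshot/20180905sp hdfs://hacluster/federation_013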

5. Use DistCp to update the data again, so that changes made after the snapshots were taken are also copied.

hadoop distcp -Dmapreduce.job.queuename=root.default -m 500 -update -prbugpcaxt hdfs://NS1/federation_013/ hdfs://hacluster/federation_013

hadoop distcp -Dmapreduce.job.queuename=root.default -m 500 -update -prbugpcaxt hdfs://NS1/hive_test/ hdfs://hacluster/hive_test

hadoop distcp -Dmapreduce.job.queuename=root.default -m 500 -update -prbugpcaxt hdfs://NS1/user/ hdfs://hacluster/user

6. Modify the location fields in the Hive metastore.

Log in to the active node where DBService resides, switch to user omm, and run the following commands:

su - omm

source /opt/huawei/Bigdata/FusionInsight_BASE_V100R002C80SPC200/install/FusionInsight-dbservice-2.7.0/.dbservice_profile

gsql -U hive -W HiveUser@ -d hivemeta -p 20051

update DBS set DB_LOCATION_URI = replace(DB_LOCATION_URI,'hdfs://NS1/','hdfs://hacluster/');

update SDS set LOCATION = replace(LOCATION,'hdfs://NS1/','hdfs://hacluster/');

update FUNC_RU set RESOURCE_URI = replace(RESOURCE_URI,'hdfs://NS1/','hdfs://hacluster/');
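Optionally, verify that no locations still reference the old namespace; each of the following queries should return 0 (these checks are a sketch against the metastore tables updated above):

select count(*) from DBS where DB_LOCATION_URI like 'hdfs://NS1/%';

select count(*) from SDS where LOCATION like 'hdfs://NS1/%';

select count(*) from FUNC_RU where RESOURCE_URI like 'hdfs://NS1/%';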

7. Check whether any Spark table parameters reference the old namespace path.

select * from TABLE_PARAMS where PARAM_VALUE like '%hdfs://NS1/%';

  • If the query result is empty, skip this step.

  • If the result is not empty, contact Huawei technical support to check whether the following statement can be executed:

update TABLE_PARAMS set PARAM_VALUE = replace(PARAM_VALUE,'hdfs://NS1/','hdfs://hacluster/');

8. Change the value of fs.defaultFS of Hive to hdfs://hacluster.

  1. On FusionInsight Manager, choose Service Management > Hive > Service Configuration > All Configurations > Hive > HDFS Client, and set the parameter to hdfs://hacluster.

  2. Save the configuration but do not restart the service.

9. Set the HDFS parameter fs.defaultFS to hdfs://hacluster.

  1. Log in to FusionInsight Manager, choose Services > HDFS > Service Configuration > All Configurations > HDFS > Default, and set the parameter to hdfs://hacluster.

  2. Save the configuration but do not restart the service.

10. Restart the services whose configurations have expired.

Log in to FusionInsight Manager, choose Services > More Actions > Synchronize Configuration, and then select and restart the role instances whose configurations have expired.
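After the restart, you can confirm the setting from a client whose configuration has been refreshed (the client installation path /opt/client is an assumption; use the actual path in your environment):

source /opt/client/bigdata_env

hdfs getconf -confKey fs.defaultFS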

11. Verify Hive. (nsdb is the Hive database used for testing, and test is the table used for testing.)

Use beeline to log in to Hive.
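For example (the client installation path and user name are placeholders; the kinit step is required only when Kerberos authentication is enabled):

source /opt/client/bigdata_env

kinit hive_test_user

beeline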

use nsdb;

show create table test;

select * from test;

describe database nsdb;

show functions;

describe function default.udf_trans2;

12. Delete snapshots.

hdfs dfs -deleteSnapshot hdfs://NS1/federation_013 20180905sp

hdfs dfs -deleteSnapshot hdfs://NS1/hive_test 20180905sp

hdfs dfs -deleteSnapshot hdfs://NS1/user 20180905sp
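Optionally, disable snapshots on the directories again to reverse the allowSnapshot settings from Step 3 (an optional cleanup, not part of the original procedure):

hdfs dfsadmin -disallowSnapshot hdfs://NS1/federation_013

hdfs dfsadmin -disallowSnapshot hdfs://NS1/hive_test

hdfs dfsadmin -disallowSnapshot hdfs://NS1/user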

13. Perform a rollback if required. To revert the Hive metastore changes, run the following statements:

update DBS set DB_LOCATION_URI = replace(DB_LOCATION_URI,'hdfs://hacluster/','hdfs://NS1/');

update SDS set LOCATION = replace(LOCATION,'hdfs://hacluster/','hdfs://NS1/');

update FUNC_RU set RESOURCE_URI = replace(RESOURCE_URI,'hdfs://hacluster/','hdfs://NS1/');

update TABLE_PARAMS set PARAM_VALUE = replace(PARAM_VALUE,'hdfs://hacluster/','hdfs://NS1/');

If the statements fail, run the following command to restore the DBService data from the backup generated in Step 2:

sh dbservice_backup.sh -r ${backup_path}

END