Importing Elk Data in Batches Using GDS

Publication Date:  2019-04-12 Views:  205 Downloads:  0

Issue Description

How do I use GDS to import Elk table data in batches?


1. Log in to the Elk cluster.

su - omm // Switch to user omm.

source /opt/huawei/Bigdata/mppdb/.mppdbgs_profile

gsql -d postgres -p 25108 -r

2. (Optional) Create a user.

create user user_sys WITH sysadmin identified by "Huawei@123"; // Create user user_sys with the system administrator (sysadmin) permission.

set user user_sys password 'Huawei@123'; // Switch the current session to user user_sys.

3. (Optional) Create a tablespace.

a. Log in to each Elk node as user root, create an empty directory, and set its owner to omm:wheel.

mkdir -p /mpp/bigdata/hdfs_test

chown -R omm:wheel /mpp/

b. Create a tablespace hdfs_test in the /mpp/bigdata/hdfs_test directory with parameters set as follows:

filesystem set to hdfs

cfgpath set to /opt/huawei/Bigdata/mppdb/conf

storepath set to /mppdb_data/distribute

hdfs_test is the newly created HDFS tablespace.

/mpp/bigdata/hdfs_test is an empty directory on which user omm has the read and write permissions.

/opt/huawei/Bigdata/mppdb/conf is the path of the HDFS cluster configuration file.

/mppdb_data/distribute is the directory for storing data on HDFS.

Note: Create the tablespace directory on every node before creating the tablespace. Otherwise, the Elk cluster becomes faulty.
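The directory requirement in the note can be checked on each node before the tablespace is created. The following is only a minimal sketch and is not part of the original procedure: the function name check_tablespace_dir is hypothetical, and the script assumes GNU stat (stat -c) is available on the node.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Elk or GDS tooling).
# Checks that a directory exists, is empty, and has the expected owner:group.
check_tablespace_dir() {
    dir="$1"
    expected_owner="$2"

    # The directory must already exist.
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi

    # The directory must be empty (ls -A also lists hidden entries).
    if [ -n "$(ls -A "$dir")" ]; then
        echo "not empty: $dir"
        return 1
    fi

    # The owner must match, for example omm:wheel (assumes GNU stat).
    owner=$(stat -c '%U:%G' "$dir")
    if [ "$owner" != "$expected_owner" ]; then
        echo "wrong owner: $owner"
        return 1
    fi

    echo "ok: $dir"
}
```

For the directory used in this article, the call would be: check_tablespace_dir /mpp/bigdata/hdfs_test omm:wheel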

4. Create an internal table.

create table test (id int, name varchar(20)) tablespace hdfs_test;

5. Log in to the data source server, create the local directory /input_data, and copy the data file data.txt into it.

su - root

mkdir -p /input_data // Directory for storing original files.

chown -R omm:wheel /input_data
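If no data file is available yet, a small test file matching the table test (id int, name varchar(20)) can be generated. This is a sketch for illustration only: the function name make_sample_data and the three rows are invented, and the single-space separator is an assumption chosen to match a space delimiter in the foreign table definition.

```shell
#!/bin/sh
# Hypothetical helper: writes a three-row, space-delimited data.txt
# into the given directory. The row contents are invented sample data.
make_sample_data() {
    target_dir="$1"
    mkdir -p "$target_dir" || return 1
    # One record per line: <id><single space><name>
    printf '%s\n' '1 alice' '2 bob' '3 carol' > "$target_dir/data.txt"
}
```

For this article the call would be: make_sample_data /input_data. If it is run as root, make sure user omm can still read the resulting file.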

6. (Optional) Install the GDS.

a. Decompress the Elk component installation package.

mkdir -p /opt/bin // Create the directory for installing the GDS service.

cd FusionInsight_MPPDB/software/components/package/FusionInsight-MPPDB-x.x.x.tar.gz/package

tar -zxvf Gauss-MPPDB-ALL-PACKAGES.tar.gz // Decompress and select the installation package of the required version.

cp Gauss200-OLAP-V100R006C10-SUSE-64bit-Gds.tar.gz /opt/bin

groupadd wheel  // Create a database user group.

useradd -g wheel omm // Create a database user

chown -R omm:wheel /opt/bin/gds

7. Start the GDS.

su - omm

/opt/bin/gds/gds -d /input_data -p -H -D

/opt/bin/gds: GDS installation path.

/input_data: directory that stores the data files (option -d).

-p: IP address of the data server, followed by the GDS listening port.

-H: IP address of the host that is allowed to connect to the GDS. Set it to the IP address of the Elk database node that is being used.

5000: GDS listening port, which can be changed.

8. Create a foreign table.

create foreign table foreign_test (id int, name varchar(20)) SERVER gsmpp_server OPTIONS (location 'gsfs://*', format 'text', mode 'normal', encoding 'utf8', delimiter ' ', null '', fill_missing_fields 'false') LOG INTO err_HR_areaS PER NODE REJECT LIMIT 'unlimited';
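Because the foreign table uses delimiter ' ' and fill_missing_fields 'false', a row that does not contain exactly two space-separated fields is expected to be rejected and logged to the error table. The source file can be pre-checked for such rows before the import. This is a minimal sketch: the function name count_bad_rows is hypothetical, and the check covers only the field count and whether id is an integer, not everything the database validates.

```shell
#!/bin/sh
# Hypothetical pre-check: counts rows that do not match "<integer> <name>".
# -F'[ ]' splits on each literal space; a plain -F' ' would collapse
# runs of whitespace and hide empty fields.
count_bad_rows() {
    awk -F'[ ]' '
        NF != 2 || $1 !~ /^-?[0-9]+$/ { bad++ }
        END { print bad + 0 }
    ' "$1"
}
```

For this article the call would be: count_bad_rows /input_data/data.txt. A result of 0 means every row has the expected two-field shape.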

9. Import data into the internal table through the foreign table.

insert into test select * from foreign_test;

Note: The GDS service must be running while data is being imported. Stop the service after the import is complete.

Start the GDS.

python start

Stop the GDS started using the configuration file.

python stop

Stop all GDS services that the current user has the permission to close.

python stop all

Stop the GDS services specified by [ip:]port that the current user has the permission to close.

python stop

Query the GDS status.

python status

cm_ctl stop -m i // Stop the Elk cluster.

cm_ctl start // Restart the Elk cluster.