
FusionInsight HD 6.5.0 Software Installation 02

Configuring a Third-party Interface (over Third-party File Systems)

Scenario

Hadoop can integrate with third-party file systems. Hadoop provides a comprehensive file system abstraction that offers high extensibility: in addition to the Hadoop Distributed File System (HDFS), Hadoop can work with other file systems through plug-ins and modules that integrate them.

Procedure

Hadoop achieves this extensibility through the Java ServiceLoader mechanism. The ServiceLoader class searches for service providers on the application's class path or in the runtime environment's extensions directory, loads them, and enables the application to use the providers' APIs. If new providers are added to the class path or the runtime extensions directory, the ServiceLoader class finds them. As long as the application knows the provider interface, it can discover and use different implementations of that interface.
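As an illustration of the mechanism (a pure-JDK sketch, not Hadoop's actual code), the following mimics what ServiceLoader does when it reads a provider configuration file: it strips comments and blank lines, then loads each listed class by its fully qualified name. The class names in the sample content are arbitrary JDK classes chosen only for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;

public class ProviderFileDemo {
    // Parses the content of a META-INF/services provider file:
    // one fully qualified class name per line; '#' starts a comment.
    static List<String> parseProviderFile(String content) {
        List<String> names = new ArrayList<>();
        for (String line : content.split("\n")) {
            int hash = line.indexOf('#');
            if (hash >= 0) line = line.substring(0, hash);
            line = line.trim();
            if (!line.isEmpty()) names.add(line);
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String file = "# comment line\n"
                    + "java.util.ArrayList\n"
                    + "\n"
                    + "java.util.LinkedList  # trailing comment\n";
        for (String name : parseProviderFile(file)) {
            // ServiceLoader similarly loads and instantiates each provider class.
            Object provider = Class.forName(name).getDeclaredConstructor().newInstance();
            System.out.println(name + " -> " + provider.getClass().getSimpleName());
        }
    }
}
```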

A JAR file is created to store the provider information. To register the service provider, you need to create a provider configuration file, which is stored in the META-INF/services directory of the service provider's JAR file. The configuration file has the following requirements:

  • The name of the configuration file is the fully qualified binary name of the service provider, which is the fully qualified class name.
  • Each component of the name is separated by a period (.), and nested classes are separated by a dollar sign ($).
  • The file must be UTF-8 encoded.
  • Begin the comment line with the number sign (#) if you want to include comments in the file.

Example: To implement a new file system, implement the org.apache.hadoop.fs.FileSystem service. To register the new file system with Hadoop, create a META-INF/services/org.apache.hadoop.fs.FileSystem file and list the class name of each implemented file system on a separate line. The following shows the configuration file shipped in the hadoop-hdfs module.

# Licensed to the Apache Software Foundation (ASF) under one or more 
# contributor license agreements.  See the NOTICE file distributed with 
# this work for additional information regarding copyright ownership. 
# The ASF licenses this file to You under the Apache License, Version 2.0 
# (the "License"); you may not use this file except in compliance with 
# the License.  You may obtain a copy of the License at 
# 
#     http://www.apache.org/licenses/LICENSE-2.0 
# 
# Unless required by applicable law or agreed to in writing, software 
# distributed under the License is distributed on an "AS IS" BASIS, 
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
# See the License for the specific language governing permissions and 
# limitations under the License. 

org.apache.hadoop.hdfs.DistributedFileSystem 
org.apache.hadoop.hdfs.web.HftpFileSystem 
org.apache.hadoop.hdfs.web.HsftpFileSystem 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem 
org.apache.hadoop.hdfs.web.SWebHdfsFileSystem

After the new file system is implemented, create a JAR file containing a configuration file named org.apache.hadoop.fs.FileSystem in its META-INF/services directory. Add the fully qualified class name of the new file system's implementation to the configuration file, and copy the JAR file to the Hadoop class path. The new file system is then available.
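Besides ServiceLoader discovery, Hadoop also lets you map a URI scheme to an implementation class explicitly through the fs.&lt;scheme&gt;.impl property in core-site.xml. In the fragment below, the scheme myfs and the class com.example.MyFileSystem are hypothetical placeholders for your own implementation:

```xml
<!-- core-site.xml: map the hypothetical URI scheme "myfs" to its
     implementation class (both names are illustrative placeholders) -->
<property>
  <name>fs.myfs.impl</name>
  <value>com.example.MyFileSystem</value>
</property>
```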

Reference

Hadoop defines an abstract file system through the Java abstract class org.apache.hadoop.fs.FileSystem, which specifies the file system interface in Hadoop. Any file system that implements this interface is supported by Hadoop. Table 5-6 describes file systems that have implemented this abstract class.

Table 5-6 File systems integrated by Hadoop

| File System | URI Scheme | Implementation Class | Definition |
| --- | --- | --- | --- |
| Local | file | org.apache.hadoop.fs.LocalFileSystem | A local file system with client-side checksum verification. A local file system without checksums is implemented in the fs.RawLocalFileSystem class. |
| ViewFs | viewfs | org.apache.hadoop.fs.viewfs.ViewFileSystem | A file system that enables the management of multiple HDFS namespaces. It is useful when an HDFS cluster has multiple NameNodes (namespaces). |
| S3 | s3 | org.apache.hadoop.fs.s3.S3FileSystem | A block-based file system backed by Amazon S3. |
| S3 (native) | s3n | org.apache.hadoop.fs.s3native.NativeS3FileSystem | A file system for reading and writing files stored on Amazon S3 in their native form. |
| FTP | ftp | org.apache.hadoop.fs.ftp.FTPFileSystem | A file system backed by an FTP server. |
| HAR | har | org.apache.hadoop.fs.HarFileSystem | A file system layered on other Hadoop file systems for archiving files. Hadoop archives files to reduce the NameNode memory usage. |
| HDFS | hdfs | org.apache.hadoop.hdfs.DistributedFileSystem | The distributed file system of Hadoop. |
| HFTP | hftp | org.apache.hadoop.hdfs.HftpFileSystem | A file system that provides read-only access to HDFS over HTTP. It is often used with DistCp to replicate data between different HDFS clusters. |
| HSFTP | hsftp | org.apache.hadoop.hdfs.HsftpFileSystem | A file system that provides read-only access to HDFS over HTTPS. |

To integrate a file system with Hadoop, a set of general interfaces must be implemented. Table 5-7 describes the required interfaces.

Table 5-7 General interfaces for file systems

| Interface | Description |
| --- | --- |
| FileSystem.open, FileSystem.create, FileSystem.append | Opens a file for reading, creates a file for writing, or opens a file for appending. |
| FSDataInputStream.read | Reads data from a file. |
| FSDataOutputStream.write | Writes data to a file. |
| FSDataInputStream.close, FSDataOutputStream.close | Closes a file. |
| FSDataInputStream.seek | Changes the read position in a file. |
| FileSystem.getFileStatus, FileSystem.get* | Obtains the attributes of a file or directory. |
| FileSystem.set* | Changes the attributes of a file. |
| FileSystem.createNewFile | Creates a file. |
| FileSystem.delete | Deletes a file, or deletes an empty subdirectory from a specified directory. |
| FileSystem.rename | Changes the name of a file or directory. |
| FileSystem.mkdirs | Creates a subdirectory in a specified directory. |
| FileSystem.listStatus | Lists the entries in a directory. |
| FileSystem.getWorkingDirectory | Returns the current working directory. |
| FileSystem.setWorkingDirectory | Changes the current working directory. |
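To make the interface set concrete, here is a minimal, purely illustrative in-memory store (not Hadoop code and much simpler than a real FileSystem subclass) that implements analogues of a few of the operations above: create, open, delete, rename, and listStatus.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A toy in-memory "file system": each path string maps to file bytes.
public class InMemoryFs {
    private final Map<String, byte[]> files = new HashMap<>();

    // Analogue of FileSystem.create: stores a file's contents.
    public void create(String path, byte[] data) { files.put(path, data); }

    // Analogue of FileSystem.open: returns a file's contents.
    public byte[] open(String path) {
        byte[] data = files.get(path);
        if (data == null) throw new IllegalArgumentException("No such file: " + path);
        return data;
    }

    // Analogue of FileSystem.delete: true if the file existed.
    public boolean delete(String path) { return files.remove(path) != null; }

    // Analogue of FileSystem.rename: moves contents to a new path.
    public boolean rename(String src, String dst) {
        byte[] data = files.remove(src);
        if (data == null) return false;
        files.put(dst, data);
        return true;
    }

    // Analogue of FileSystem.listStatus: paths under the given directory.
    public List<String> listStatus(String dir) {
        String prefix = dir.endsWith("/") ? dir : dir + "/";
        List<String> result = new ArrayList<>();
        for (String path : files.keySet()) {
            if (path.startsWith(prefix)) result.add(path);
        }
        return result;
    }

    public static void main(String[] args) {
        InMemoryFs fs = new InMemoryFs();
        fs.create("/tmp/a.txt", "hello".getBytes());
        fs.rename("/tmp/a.txt", "/tmp/b.txt");
        System.out.println(fs.listStatus("/tmp"));   // prints [/tmp/b.txt]
    }
}
```

A real implementation would back these operations with the third-party store and return Hadoop types such as FSDataInputStream and FileStatus instead of raw byte arrays.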

NOTE:

The FSDataInputStream and FSDataOutputStream classes extend java.io.DataInputStream and java.io.DataOutputStream, respectively. A new file system must also implement its own input and output stream classes.
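FSDataInputStream adds random access (the seek operation from Table 5-7) on top of java.io.DataInputStream. The following pure-JDK sketch illustrates that pattern over a byte array; a real implementation would instead wrap the underlying file system's stream and seek natively.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Sketch: a DataInputStream over a resettable source, with a seek()
// analogue implemented by reopening and skipping from the start.
public class SeekableDemo {
    private final byte[] data;
    private DataInputStream in;

    public SeekableDemo(byte[] data) {
        this.data = data;
        this.in = new DataInputStream(new ByteArrayInputStream(data));
    }

    // Changes the read position, like FSDataInputStream.seek.
    public void seek(long pos) throws IOException {
        in = new DataInputStream(new ByteArrayInputStream(data));
        long skipped = in.skip(pos);
        if (skipped != pos) throw new IOException("Cannot seek to " + pos);
    }

    public int read() throws IOException { return in.read(); }

    public static void main(String[] args) throws IOException {
        SeekableDemo s = new SeekableDemo("abcdef".getBytes());
        s.seek(3);
        System.out.println((char) s.read());  // prints d
    }
}
```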

Updated: 2019-05-17

Document ID: EDOC1100074555