FusionInsight HD 6.5.0 Administrator Guide 02

High-Risk Operations Overview

Table 15-1 lists forbidden operations during FusionInsight HD operation and maintenance.

Table 15-1 Forbidden operations

Operation: Delete ZooKeeper data directories.
Risk: HDFS, Yarn, HBase, and Hive depend on ZooKeeper, which stores their metadata. This operation adversely affects the normal operation of these components.

Operation: Delete Elasticsearch indexes associated with GraphBase.
Risk: The external index data of GraphBase is stored in Elasticsearch. Deleting Elasticsearch indexes through a non-GraphBase interface causes data inconsistency.

Operation: Delete HBase tables associated with GraphBase.
Risk: GraphBase data is stored in HBase. This operation will cause data loss.

Operation: Switch between active and standby JDBCServer nodes frequently.
Risk: This operation may interrupt services.

Operation: Delete Phoenix system tables and data (SYSTEM.CATALOG, SYSTEM.STATS, SYSTEM.SEQUENCE, and SYSTEM.FUNCTION).
Risk: This operation will cause service operation failures.

Operation: Manually modify Hive metadatabase data (hivemeta).
Risk: This operation may cause Hive data parse errors. As a result, Hive cannot provide services.

Operation: Change permissions on the Hive private file directory hdfs:///tmp/hive-scratch.
Risk: This operation may make the Hive service unavailable.

Operation: Change broker.id in the Kafka configuration file.
Risk: This operation may invalidate the data on the node.

Operation: Modify the host name of a node.
Risk: Instances and upper-layer components on the host cannot provide services properly, and the fault cannot be rectified.

Table 15-2 lists high-risk operations during FusionInsight HD operation and maintenance.

Table 15-2 High-risk operations

The entries below are grouped by category (component). Each entry lists, in order: Operation, Risk, Risk Level, Workaround, and Check Item.

Manager

Change the OMS password.

This operation will restart all processes of OMSServer, which has adverse impact on cluster maintenance and management.

▲▲▲

Before the change, confirm that the operation is needed. Ensure that no other maintenance and management operations are performed at the same time.

Check whether all alarms are cleared and whether cluster maintenance and management operations are normal.

Import the certificate.

This operation will restart OMS processes and the entire cluster, which has adverse impact on cluster maintenance and management and services.

▲▲▲

Before the change, confirm that the operation is needed. Ensure that no other maintenance and management operations are performed at the same time.

Check whether all alarms are cleared, whether cluster maintenance and management operations are normal, and whether services are normal.

Perform an upgrade.

This operation will restart Manager and the entire cluster, which has adverse impact on cluster maintenance and management and services.

Strictly manage the user who is eligible to assign the cluster management permission to prevent security risks.

▲▲▲

Ensure that no other maintenance and management operations are performed at the same time.

Check whether all alarms are cleared, whether cluster maintenance and management operations are normal, and whether services are normal.

Install a patch.

This operation will restart Manager and the entire cluster, which has adverse impact on cluster maintenance and management and services.

Strictly manage the user who is eligible to assign the cluster management permission to prevent security risks.

▲▲▲

Ensure that no other maintenance and management operations are performed at the same time.

Check whether all alarms are cleared, whether cluster maintenance and management operations are normal, and whether services are normal.

Restore the OMS.

This operation will restart Manager and the entire cluster, which has adverse impact on cluster maintenance and management and services.

▲▲▲

Before the operation, confirm that the operation is needed. Ensure that no other maintenance and management operations are performed at the same time.

Check whether all alarms are cleared, whether cluster maintenance and management operations are normal, and whether services are normal.

Change an IP address.

This operation will restart Manager and the entire cluster, which has adverse impact on cluster maintenance and management and services.

▲▲▲

Ensure that no other maintenance and management operations are performed at the same time and that the new IP address is correct.

Check whether all alarms are cleared, whether cluster maintenance and management operations are normal, and whether services are normal.

Change log levels.

If the log level is changed to DEBUG, Manager responds slowly.

▲▲

Before the change, confirm that the operation is needed. Change the log level back to the default value after the operation.

None

Replacing a Control Node

This operation will interrupt services deployed on the node. If the node also serves as a management node, the operation will restart all OMS processes, affecting the cluster management and maintenance.

▲▲▲

Before performing the operation, ensure that the operation is necessary, and that no other management and maintenance operations are performed at the same time.

Check whether uncleared alarms exist, and whether the management and maintenance of the cluster are normal and whether services are normal.

Replacing a management node

This operation will interrupt services deployed on the node. As a result, OMS processes will be restarted, affecting the cluster management and maintenance.

▲▲▲▲

Before performing the operation, ensure that the operation is necessary, and that no other management and maintenance operations are performed at the same time.

Check whether uncleared alarms exist, and whether the management and maintenance of the cluster are normal and whether services are normal.

Selecting Restart upper-layer services during the restart of a lower-layer service

This operation will interrupt the upper-layer service, affecting the management, maintenance, and services of the cluster.

▲▲▲▲

Before performing the operation, ensure that the operation is necessary, and that no other management and maintenance operations are performed at the same time.

Check whether uncleared alarms exist, and whether the management and maintenance of the cluster are normal and whether services are normal.

Modifying the OLDAP port

This operation will restart the LdapServer and Kerberos services and all associated services, affecting service running.

▲▲▲▲▲

Before performing the operation, ensure that the operation is necessary, and that no other management and maintenance operations are performed at the same time.

N/A

Delete the supergroup group to which a user is bound.

Deleting the supergroup group decreases user rights, affecting service access.

▲▲▲▲▲

Before the change, confirm the rights to be added. Ensure that the required rights have been added before deleting the supergroup rights to which the user is bound, ensuring service continuity.

None

Reinstall a host.

This operation will reinstall the software on the specified host and may cause data loss due to the cleanup of the data directory.

▲▲▲

Before performing this operation, ensure that the reinstallation is necessary and exercise caution when selecting the data cleanup option.

Check whether alarms that are not cleared exist, and whether the management and maintenance of the cluster are normal and whether services are normal.

Reinstall an instance.

This operation will reinstall the instance on the specified host and may cause data loss due to the cleanup of the data directory.

▲▲▲

Before performing this operation, ensure that the reinstallation is necessary and exercise caution when selecting the data cleanup option.

Check whether alarms that are not cleared exist, and whether the management and maintenance of the cluster are normal and whether services are normal.

Restart a service.

Services will be interrupted during the restart. If you select and restart the upper-layer service, the upper-layer services that depend on the service will be interrupted.

▲▲▲

Confirm the necessity of restarting the service before the operation.

Check whether alarms that are not cleared exist, and whether the management and maintenance of the cluster are normal and whether services are normal.

Change the default SSH port number.

If the port number is modified, functions such as cluster creation, service/instance addition, host addition, and host reinstallation cannot be used.

▲▲▲

Change the SSH port to the default value 22 before performing related operations.

None

Power off and power on a system.

If a system is powered off and on in a non-standard way, the cluster may fail to start properly. For example, LDAP data fails to be synchronized, or the controller fails to start.

▲▲▲

Follow the instructions in section System Power-on and Power-off.

Check whether alarms that are not cleared exist, and whether the management and maintenance of the cluster are normal and whether services are normal.

HBase

Modify encryption configuration.

  • hbase.regionserver.wal.encryption
  • hbase.crypto.keyprovider.parameters.uri
  • hbase.crypto.keyprovider.parameters.encryptedtext

This operation may cause service start abnormality.

▲▲▲▲

Strictly follow the prompt information when modifying these configuration items, which are associated with each other. Ensure that the new values are valid.

Check whether services can be started properly.

Change the value of hbase.regionserver.wal.encryption to false or switch encryption algorithm from AES to SMS4.

This operation may cause start failures and data loss.

▲▲▲▲

If HFile and WAL are encrypted with an encryption algorithm and encrypted tables have been created, do not disable or switch the encryption algorithm arbitrarily.

If no encrypted table (ENCRYPTION=>AES/SMS4) has been created, you can switch the encryption algorithm.

None

Modify the HBase instance startup parameters GC_OPTS and HBASE_HEAPSIZE.

This operation may cause service start abnormality.

▲▲

Strictly follow the prompt information when modifying related configuration items. Ensure that the new values are valid and that GC_OPTS does not conflict with HBASE_HEAPSIZE.

Check whether services can be started properly.

Use the OfflineMetaRepair tool.

This operation may cause service start abnormality.

▲▲▲▲

This command can be used only when HBase is offline and cannot be used in data migration scenarios.

Check whether HBase services can be started properly.
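For reference, a minimal sketch of how the tool is typically invoked is shown below. The client path and user name are examples only, and the exact class name and options can vary between HBase versions, so verify them in your environment before use:

  # Run only while all HBase instances are stopped; never use this during data migration.
  source /opt/client/bigdata_env      # example client installation path
  kinit hbaseadmin                    # example administrator user in a security-enabled cluster
  hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair

After the repair, start HBase and perform the check item above.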

Yarn

Delete or change the data directories yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs.

This operation may cause service information loss.

▲▲▲

Do not delete data directories manually.

Check whether data directories are normal.

Spark

Modify configuration items (spark.yarn.queue and spark.driver.extraJavaOptions)

This operation may cause service start abnormality.

▲▲

Strictly follow the prompt information when modifying related configuration items. Ensure that new values are valid.

Check whether services can be started properly.

Modifying the configuration item (spark.yarn.cluster.driver.extraJavaOptions)

Services fail to be started.

▲▲

When modifying related configuration items, ensure that the new values are valid.

Check whether the services are started properly.

Modifying the configuration item (spark.eventLog.dir)

Services fail to be started.

▲▲

When modifying related configuration items, ensure that the new values are valid.

Check whether the services are started properly.

Modifying the configuration item (SPARK_DAEMON_JAVA_OPTS)

Services fail to be started.

▲▲

When modifying related configuration items, ensure that the new values are valid.

Check whether the services are started properly.

Deleting all JobHistory instances

The event logs of historical applications are lost.

▲▲

Reserve at least one JobHistory instance.

Check whether historical application information is included in JobHistory.

Deleting or modifying /user/spark/lib/6.5.0/spark-assembly-1.5.1-hadoop3.1.1.zip

JDBCServer fails to be started and service functions are abnormal.

▲▲▲

Delete /user/spark/lib/6.5.0/spark-assembly-1.5.1-hadoop3.1.1.zip, and wait for 10-15 minutes until the .zip package is automatically restored.

Check whether the services are started properly.

Spark2x

Modifying the configuration item (spark.yarn.queue)

Services fail to be started.

▲▲

When modifying related configuration items, ensure that the new values are valid.

Check whether the services are started properly.

Modifying the configuration item (spark.driver.extraJavaOptions)

Services fail to be started.

▲▲

When modifying related configuration items, ensure that the new values are valid.

Check whether the services are started properly.

Modifying the configuration item (spark.yarn.cluster.driver.extraJavaOptions)

Services fail to be started.

▲▲

When modifying related configuration items, ensure that the new values are valid.

Check whether the services are started properly.

Modifying the configuration item (spark.eventLog.dir)

Services fail to be started.

▲▲

When modifying related configuration items, ensure that the new values are valid.

Check whether the services are started properly.

Modifying the configuration item (SPARK_DAEMON_JAVA_OPTS)

Services fail to be started.

▲▲

When modifying related configuration items, ensure that the new values are valid.

Check whether the services are started properly.

Deleting all JobHistory2x instances

The event logs of historical applications are lost.

▲▲

Reserve at least one JobHistory2x instance.

Check whether historical application information is included in JobHistory2x.

Deleting or modifying /user/spark2x/jars/6.5.0/spark-archive-2x.zip

JDBCServer2x fails to be started and service functions are abnormal.

▲▲▲

Delete /user/spark2x/jars/6.5.0/spark-archive-2x.zip, and wait for 10-15 minutes until the .zip package is automatically restored.

Check whether the services are started properly.

ZooKeeper

Delete or change ZooKeeper data directories.

This operation may cause service information loss.

▲▲▲

Follow the capacity expansion guide to change the ZooKeeper data directories.

Check whether services and associated components are started properly.

Modify the ZooKeeper instance start parameter GC_OPTS.

This operation may cause service start abnormality.

▲▲

Strictly follow the prompt information when modifying related configuration items. Ensure that new values are valid.

Check whether services can be started properly.

Modify the znode ACL information in ZooKeeper.

If znode permission is modified in ZooKeeper, other users may have no permission to access the znode and some system functions are abnormal.

▲▲▲▲

During the modification, strictly follow the ZooKeeper Configuration Guide and ensure that other components can use ZooKeeper properly after ACL information modification.

Check that other components that depend on ZooKeeper can properly start and provide services.
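Before changing any znode ACL, record the current ACL so that it can be restored if dependent components fail. The following sketch uses the standard ZooKeeper client shell; the client path, server address, znode, and ACL value are examples only:

  source /opt/client/bigdata_env                    # example client installation path
  zkCli.sh -server <zookeeper-host>:<client-port>   # open the ZooKeeper command-line shell
  # Inside the zkCli shell:
  getAcl /hbase                                     # record the current ACL of the target znode
  setAcl /hbase sasl:hbase:cdrwa                    # apply the new ACL only after confirming dependent components

After the change, verify that the components that depend on the znode still start and provide services.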

HDFS

Change the HDFS NameNode data storage directory dfs.namenode.name.dir and the DataNode data storage directory dfs.datanode.data.dir.

This operation may cause service start abnormality.

▲▲▲▲▲

Strictly follow the prompt information when modifying related configuration items. Ensure that new values are valid.

Check whether services can be started properly.

Modify the HDFS instance startup parameters GC_OPTS, HADOOP_HEAPSIZE, and GC_PROFILE.

This operation may cause service start abnormality.

▲▲

Strictly follow the prompt information when modifying related configuration items. Ensure that the new values are valid and that GC_OPTS does not conflict with HADOOP_HEAPSIZE.

Check whether services can be started properly.

Change the default value of dfs.replication from 3 to 1.

This operation will have the following impacts:

1. The storage reliability deteriorates. If the disk becomes faulty, data will be lost.

2. NameNode fails to be restarted, and the HDFS service is unavailable.

▲▲▲▲

When modifying related configuration items, check the parameter description carefully. Ensure that there are more than two replicas for data storage.

Check that the default number of replicas is not 1 and that the HDFS service is normal.
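Before and after changing dfs.replication, the configured and actual replication factors can be checked with standard HDFS commands; the file path below is an example:

  hdfs getconf -confKey dfs.replication        # configured default replication factor
  hdfs dfs -stat "%r %n" /tmp/example_file     # actual replication factor of an existing file
  hdfs dfs -setrep -w 3 /tmp/example_file      # restore a file to 3 replicas if it was written with too few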

Change the RPC channel encryption mode of each module in Hadoop.

This operation causes service faults and service exceptions.

▲▲▲▲▲

Strictly follow the configuration guide, and make sure that the modified value is valid.

Check whether HDFS and other services that depend on HDFS can properly start and provide services.

Loader

Change the floating IP address of a Loader instance (loader.float.ip).

This operation may cause service start abnormality.

▲▲

Strictly follow the prompt information when modifying related configuration items. Ensure that new values are valid.

Check whether the Loader UI can be connected properly.

Modify the Loader instance start parameter LOADER_GC_OPTS.

This operation may cause service start abnormality.

▲▲

Strictly follow the prompt information when modifying related configuration items. Ensure that new values are valid.

Check whether services can be started properly.

Clear table contents when adding data to HBase.

This operation will clear original data in the target table.

▲▲

Ensure that the contents in the target table can be cleared before the operation.

Check whether the contents in the target table can be cleared before the operation.

FTP-Server

Modify the FTP-Server instance start parameter GC_OPTS.

This operation may cause service start abnormality.

▲▲

Strictly follow the prompt information when modifying related configuration items. Ensure that new values are valid.

Check whether services can be started properly.

Change the value of ftp-server-ip.

This operation may cause service start abnormality.

▲▲

Change the IP address based on the actual environment.

Check whether services can be started properly.

Flume

Modify the Flume instance start parameter GC_OPTS.

This operation may cause service start abnormality.

▲▲

Strictly follow the prompt information when modifying related configuration items. Ensure that new values are valid.

Check whether services can be started properly.

Change the default value of dfs.replication from 3 to 1.

This operation will have the following impacts:

1. The storage reliability deteriorates. If the disk becomes faulty, data will be lost.

2. NameNode fails to be restarted, and the HDFS service is unavailable.

▲▲▲▲

When modifying related configuration items, check the parameter description carefully. Ensure that there are more than two replicas for data storage.

Check whether the default replica number is not 1 and whether the HDFS service is normal.

Solr

Modify Solr instance port parameters (SOLR_PORT, SOLR_SEC_PORT, and SOLR_CONTROL_PORT).

Incorrect operations will cause instance start and stop exceptions.

▲▲

Strictly follow the prompt information when modifying related configuration items. Ensure that new values are valid.

Check whether service instances can be started or stopped properly.

Modify the Solr parameter INDEX_STORED_ON_HDFS.

  • If the Collection's configuration set solrconfig.xml contains <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}">, modifying the INDEX_STORED_ON_HDFS parameter changes the index storage location of the Collections that use this configuration, and these Collections must be indexed again. Index data in the original storage location is not deleted automatically.
  • If the Collection's configuration set solrconfig.xml contains <directoryFactory name="DirectoryFactory" class="solr.NRTCachingDirectoryFactory">, the indexes of the Collections that use this configuration are not affected by changes to the INDEX_STORED_ON_HDFS parameter.

▲▲▲

Identify the Collection that will be affected when the parameter is modified.

  • To avoid this effect, change <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"> in the Collection's configuration set solrconfig.xml to <directoryFactory name="DirectoryFactory" class="solr.NRTCachingDirectoryFactory">.
  • Index the affected Collections again.

None

Delete two SolrServerAdmin instances at the same time.

This operation may make the service unavailable.

▲▲▲

Only one instance can be deleted. Add a new one as soon as possible if one instance is deleted.

To implement instance migration, see the Cluster Component Migration Guide and perform related operations. You must migrate the two instances at the same time and migrate data in advance.

Check whether services are operating normally.

Modify the Solr instance startup parameters SOLR_GC_OPTS and SOLR_HEAPSIZE.

This operation may cause service start abnormality.

▲▲

Strictly follow the prompt information when modifying related configuration items. Ensure that the new values are valid and that SOLR_GC_OPTS does not conflict with SOLR_HEAPSIZE.

Check whether services can be started properly.

Kafka

Delete Topic

This operation may delete existing topics and data.

▲▲▲

Kerberos authentication is used to ensure that authenticated users have operation permissions. Ensure that topic names are correct.

Check whether topics are processed properly.
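Before deleting a topic, describe it to confirm the topic name and its partition and replica layout. The following sketch assumes the Kafka client is installed; the client path, principal, topic name, and ZooKeeper address are examples, and the option names (--zookeeper versus --bootstrap-server) depend on the Kafka client version:

  source /opt/client/bigdata_env        # example client installation path
  kinit kafkaadmin                      # example administrator user in a security-enabled cluster
  kafka-topics.sh --describe --topic example_topic --zookeeper <zookeeper-host>:<port>/kafka
  kafka-topics.sh --delete --topic example_topic --zookeeper <zookeeper-host>:<port>/kafka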

Delete data directories.

This operation may cause service information loss.

▲▲▲

Do not delete data directories manually.

Check whether data directories are normal.

Modify data directory content (file and folder creation).

This operation may cause faults on the Broker instance of the node.

▲▲▲

Do not create or modify files or folders in the data directories manually.

Check whether data directories are normal.

Modify the disk auto-adaptation function using the disk.adapter.enable parameter.

This operation adjusts the topic data retention period when the disk usage reaches the threshold. Historical data that falls outside the retention period may be deleted.

▲▲▲

If the retention period of certain topics must not be adjusted, add these topics to the value of disk.adapter.topic.blacklist.

Observe the data storage period on the Kafka topic monitoring page.

Modify data directory log.dirs configuration.

Incorrect operation may cause process faults.

▲▲▲

Ensure that the added or modified data directories are empty and that the directory permissions are right.

Check whether data directories are normal.

Reduce the capacity of the Kafka cluster.

This operation may reduce the number of replicas of some topic partitions. As a result, some topics may become inaccessible.

▲▲

Perform a backup and then reduce the capacity of the Kafka cluster.

Check whether the replica nodes of the affected partitions are active to ensure data security.

Start or stop basic components independently.

This operation has adverse impact on the basic functions of some services. As a result, service failures occur.

▲▲▲

Do not start or stop ZooKeeper, Kerberos, and LDAP basic components independently. Select related services when performing this operation.

Check whether services are operating normally.

Restart or stop services.

This operation may interrupt services.

▲▲

Restart or stop services if necessary.

Check whether services are operating normally.

Modify configuration parameters.

This operation requires service restart for configuration to take effect.

▲▲

Modify configuration if necessary.

Check whether services are operating normally.

Deleting or modifying metadata

Modifying or deleting Kafka metadata stored on ZooKeeper may make Kafka topics or the Kafka service unavailable.

▲▲▲

Do not delete or modify Kafka metadata stored on ZooKeeper.

Check whether the Kafka topics or Kafka service is available.

Deleting metadata backup files

After Kafka metadata backup files are modified and used to restore Kafka metadata, Kafka topics or the Kafka service may be unavailable.

▲▲▲

Do not modify metadata backup files.

Check whether the Kafka topics or Kafka service is available.

Hive

Modify the Hive instance start parameter GC_OPTS.

This operation may cause Hive instance start failures.

▲▲

Strictly follow the prompt information when modifying related configuration items. Ensure that new values are valid.

Check whether services can be started properly.

Delete all MetaStore instances.

This operation may cause Hive metadata loss. As a result, Hive cannot provide services.

▲▲▲

Do not perform this operation unless you are sure that the Hive table information can be discarded.

Check whether services can be started properly.

Delete or modify files corresponding to Hive tables over HDFS interfaces or HBase interfaces.

This operation may cause Hive service data loss or tampering.

▲▲

Do not perform this operation unless you are sure that the data can be discarded or that the operation meets service requirements.

Check whether Hive data is complete.

Delete or modify the files or directory access permissions corresponding to Hive tables over HDFS interfaces or HBase interfaces.

This operation may make some service scenarios unavailable.

▲▲▲

Do not perform this operation.

Check whether related service operations are normal.

Delete or modify hdfs:///apps/templeton/hive-1.1.0.tar.gz over HDFS interfaces.

This operation causes WebHCat to fail to provide services.

▲▲

Do not perform this operation.

Check whether related service operations are normal.

Export table data to overwrite local data. For example, export the data of table t1 to /opt/dir:

insert overwrite local directory '/opt/dir' select * from t1;

This operation will delete target directories. Incorrect setting may cause software or OS startup failures.

▲▲▲▲▲

Ensure that the target path contains no files, or do not use the keyword overwrite in the command.

Check whether files in the target path are lost.
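A quick local check, using the example directory above, helps confirm that the target path is empty before running an insert overwrite local directory statement:

  # Any output here means existing files in /opt/dir would be deleted by the overwrite.
  ls -A /opt/dir 2>/dev/null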

Point different databases, tables, or partitions to the same path, for example, the default warehouse path /user/hive/warehouse.

This may cause disordered data. After one database, table, or partition is deleted, the data of the other objects is also lost.

▲▲▲▲▲

Do not perform this operation.

Check whether files in the target path are lost.

KrbServer

Modify the KADMIN_PORT parameter of KrbServer.

After this parameter is modified, if the KrbServer service and its associated services are not restarted in a timely manner, the configuration of KrbClient in the cluster is abnormal and the service running is affected.

▲▲▲▲▲

After this parameter is modified, restart the KrbServer service and all its associated services.

None

Modify the kdc_ports parameter of KrbServer.

After this parameter is modified, if the KrbServer service and its associated services are not restarted in a timely manner, the configuration of KrbClient in the cluster is abnormal and the service running is affected.

▲▲▲▲▲

After this parameter is modified, restart the KrbServer service and all its associated services.

None

Modify the KPASSWD_PORT parameter of KrbServer.

After this parameter is modified, if the KrbServer service and its associated services are not restarted in a timely manner, the configuration of KrbClient in the cluster is abnormal and the service running is affected.

▲▲▲▲▲

After this parameter is modified, restart the KrbServer service and all its associated services.

None

Modify the default_realm parameter of KrbServer.

After this parameter is modified, if the KrbServer service and its associated services are not restarted in a timely manner, the configuration of KrbClient in the cluster is abnormal and the service running is affected.

▲▲▲▲▲

After this parameter is modified, restart the KrbServer service and all its associated services.

None

Configuring cross-cluster mutual trust relationships

This operation will restart the KrbServer service and all associated services, affecting the management, maintenance, and services of the cluster.

▲▲▲▲▲

Before performing the operation, ensure that the operation is necessary, and that no other management and maintenance operations are performed at the same time.

Check whether alarms that are not cleared exist, and whether the management and maintenance of the cluster are normal and whether services are normal.

LdapServer

Modify the LDAP_SERVER_PORT parameter of LdapServer.

After this parameter is modified, if the LdapServer service and its associated services are not restarted in a timely manner, the configuration of LdapClient in the cluster is abnormal and the service running is affected.

▲▲▲▲▲

After this parameter is modified, restart the LdapServer service and all its associated services.

None

Restoring LdapServer data

This operation will restart FusionInsight Manager and the entire cluster, affecting the management, maintenance, and services of the cluster.

▲▲▲▲▲

Before performing the operation, ensure that the operation is necessary, and that no other management and maintenance operations are performed at the same time.

Check whether alarms that are not cleared exist, and whether the management and maintenance of the cluster are normal and whether services are normal.

Replacing the Node where LdapServer resides

This operation will interrupt services deployed on the node. If the node is a management node, the operation will restart all OMS processes, affecting the cluster management and maintenance.

▲▲▲

Before performing the operation, ensure that the operation is necessary, and that no other management and maintenance operations are performed at the same time.

Check whether alarms that are not cleared exist, and whether the management and maintenance of the cluster are normal and whether services are normal.

Changing the password of LdapServer

The LdapServer and Kerberos services need to be restarted during the password change, affecting the management, maintenance, and services of the cluster.

▲▲▲▲

Before performing the operation, ensure that the operation is necessary, and that no other management and maintenance operations are performed at the same time.

N/A

Restarting the node where LdapServer resides

Restarting the node without stopping the LdapServer service may cause LdapServer data damage.

▲▲▲▲▲

Restore LdapServer using the LdapServer backup data.

N/A

Storm

Modify the following plug-in related configuration items:

  • storm.scheduler
  • nimbus.authorizer
  • storm.thrift.transport
  • nimbus.blobstore.class
  • nimbus.topology.validator
  • storm.principal.tolocal

This operation may cause service startup abnormality.

▲▲▲▲

Strictly follow the prompt information when modifying related configuration items. Ensure that the class names exist and are valid.

Check whether services can be started properly.

Modify the following startup parameters of Storm instances:

  • GC_OPTS
  • NIMBUS_GC_OPTS
  • SUPERVISOR_GC_OPTS
  • UI_GC_OPTS
  • LOGVIEWER_GC_OPTS

This operation may cause service startup abnormality.

▲▲

Strictly follow the prompt information when modifying related configuration items. Ensure that new values are valid.

Check whether services can be started properly.

Modify the configuration parameter resource.aware.scheduler.user.pool of the user's resource pool.

Services cannot run properly.

▲▲▲

Strictly follow the prompt information when modifying related configuration items. Ensure that resources allocated to each user are appropriate and valid.

Check whether services can be started and run properly.

Delete all Supervisor instances.

This operation may cause service running failures because no resources are available.

▲▲▲

Ensure that there are sufficient Supervisor instances for service running.

None

Changing data directories

If this operation is not properly performed, services may be abnormal and unavailable.

▲▲▲▲

Do not manually change data directories.

Check whether the related data directories are normal.

Restarting services or instances

The service will be interrupted for a short period of time, and ongoing operations will be interrupted.

▲▲▲

Restart services or instances when necessary.

Check whether the service is running properly and whether interrupted operations are restored.

Synchronizing configurations (by restarting the required service)

The service will be restarted, resulting in temporary service interruption. If Supervisor is restarted, ongoing operations will be interrupted for a short period of time.

▲▲▲

Modify configurations when necessary.

Check whether the service is running properly and interrupted operations are restored.

Stopping services or instances

The service will be stopped, and related operations will be interrupted.

▲▲▲

Stop services when necessary.

Check whether the services are properly stopped.

Deleting or modifying metadata

If Nimbus metadata is deleted, services are abnormal and ongoing operations are lost.

▲▲▲▲▲

Do not manually delete Nimbus metadata files.

Check whether Nimbus metadata files are normal.

Modifying file permissions

If permissions on the metadata and log directories are incorrectly modified, service exceptions may occur.

▲▲▲▲

Do not manually modify file permissions.

Check whether the permissions on the data and log directories are correct.

Deleting topologies

Topologies in use will be deleted.

▲▲▲▲

Delete topologies when necessary.

Check whether the topologies are successfully deleted.

DBService

Changing the DBService password

The services need to be restarted for the password change to take effect. The services are unavailable during the restart.

▲▲▲▲

Confirm the necessity of changing the password, and ensure no other O&M operations are performed when the password is changed.

Check whether there are alarms that are not cleared and whether the cluster management and maintenance are normal.

Restoring DBService Data

After the data is restored, the data generated between the backup point in time and the restoration point in time is lost. After the data is restored, the configuration of the components that depend on DBService may expire and these components need to be restarted.

▲▲▲▲

Confirm the necessity of restoring data, and ensure no other O&M operations are performed when data is restored.

Check whether there are alarms that are not cleared and whether the cluster management and maintenance are normal.

Deleting DBService instances

Data may be lost and cannot be restored, service running may be faulty, or other instance configurations may expire.

▲▲▲

Confirm the necessity of deleting DBService instances, and ensure no other O&M operations are performed when the instances are deleted.

Check whether there are alarms that are not cleared and whether the cluster management and maintenance are normal.

Performing active/standby DBService switchover

During the DBServer switchover, DBService is unavailable.

▲▲

Confirm the necessity of performing active/standby DBService switchover, and ensure no other O&M operations are performed during the switchover.

None

Changing the DBService floating IP address

The DBService needs to be restarted for the change to take effect. The DBService is unavailable during the restart. If the floating IP address has been used, the configuration will fail, and the DBService will fail to be started.

▲▲▲▲

Strictly follow the configuration guide, and make sure that the new floating IP address is valid.

Check whether DBService is properly started.

Flink

Changing the log level

If the log level is modified to DEBUG, the task running performance is affected.

▲▲

Before the modification, confirm the necessity of the operation and change it back to the default log level in time.

None

Modifying file permissions

Tasks may fail.

▲▲▲

Confirm the necessity of the operation before the modification.

Check whether related service operations are normal.

Streaming

Changing data directories

If this operation is not properly performed, services may be abnormal and unavailable.

▲▲▲▲

Do not manually change data directories.

Check whether the related data directories are normal.

Restarting services or instances

The service will be interrupted for a short period of time, and ongoing operations will be interrupted.

▲▲▲

Restart services or instances when necessary.

Check whether the service is running properly and whether interrupted operations are restored.

Synchronizing configurations (by restarting the required service)

The service will be restarted, resulting in temporary service interruption. If Supervisor is restarted, ongoing operations will be interrupted for a short period of time.

▲▲▲

Modify configurations when necessary.

Check whether the service is running properly and interrupted operations are restored.

Stopping services or instances

The service will be stopped, and related operations will be interrupted.

▲▲▲

Stop services when necessary.

Check whether the services are properly stopped.

Deleting or modifying metadata

If Nimbus metadata is deleted, services are abnormal and ongoing operations are lost.

▲▲▲▲▲

Do not manually delete Nimbus metadata files.

Check whether Nimbus metadata files are normal.

Modifying file permissions

If permissions on the metadata and log directories are incorrectly modified, service exceptions may occur.

▲▲▲▲

Do not manually modify file permissions.

Check whether the permissions on the data and log directories are correct.

Deleting topologies

Topologies in use will be deleted.

▲▲▲▲

Delete topologies when necessary.

Check whether the topologies are successfully deleted.

Redis

Changing the memory size

If Redis occupies too much memory, the process may be stopped by the OS, or OOM may occur.

▲▲▲▲

Properly plan the maximum memory size of each process based on the OS memory to prevent OOM.

Check whether the service is running properly.

Restarting services or instances

This operation will interrupt ongoing operations.

▲▲

Restart or stop services when necessary.

Check whether the service is running properly and whether interrupted operations are restored.

Synchronizing configurations (by restarting the required service)

During the restart, the Redis cluster where the instance resides cannot provide services.

▲▲

Restart or stop services when necessary.

Check whether the service is running properly and whether interrupted operations are restored.

Stopping services or instances

This operation will interrupt ongoing operations.

▲▲

Restart or stop services when necessary.

Check whether the services are properly stopped.

Deleting services or instances

Data stored in Redis is lost, and Redis cannot provide services.

▲▲▲

Do not perform this operation unless data stored in Redis needs to be discarded.

Check whether the service is successfully deleted.

Performing cluster capacity expansion or reduction

Reading and writing of some service data may become faulty while data is being migrated.

▲▲▲

Minimize the Redis cluster load, or if possible ensure that the Redis cluster is not providing services during the operation.

Check whether cluster capacity expansion or reduction is successful.

Changing a password

During password change, Redis cannot provide services.

▲▲

Change the password when necessary.

Use the new password for login again and check whether the login is successful.

Deleting a Redis cluster

Data stored in the Redis cluster will be lost, and services that access the cluster will be interrupted.

▲▲▲

Do not perform this operation unless data stored in the Redis cluster needs to be discarded.

Check whether the Redis cluster is deleted.

Deleting or modifying a backup file

The Redis process will be restarted or become faulty, and data will be lost.

▲▲▲▲

Do not perform this operation.

Check whether the Redis cluster is normal and whether data is lost.

Deleting or modifying metadata

The Redis service will be faulty.

▲▲▲▲

Do not perform this operation.

Check whether the Redis service is normal.

Modifying file permissions

The Redis service may be faulty.

▲▲▲▲

Do not perform this operation.

Check whether related service operations are normal.

Deleting or modifying a file

The Redis service may be faulty.

▲▲▲▲

Do not perform this operation.

Check whether the Redis service is normal.

Updated: 2019-05-17
Document ID: EDOC1100074522