ALM-37016 XLog Archive Command Fails to Be Executed on the MPPDBServer
Description
This alarm is generated when the XLog archive command fails to be executed for the Coordinator instance or DataNode instance in a cluster.
This alarm is automatically cleared when the XLog archive command is executed successfully.
Attribute
Alarm ID | Alarm Severity | Auto Clear |
---|---|---|
37016 | Major | Yes |
Parameters
Name | Meaning |
---|---|
ServiceName | Identifies the service for which the alarm is generated. |
RoleName | Identifies the role for which the alarm is generated. |
HostName | Identifies the host for which the alarm is generated. |
Instance | Identifies the instance for which the alarm is generated. |
Impact on the System
Some XLog files may not be archived to the archive directory specified in the archive command. As a result, XLog files keep accumulating and can eventually exhaust the disk space.
Possible Causes
- The archive command is incorrect.
- The user does not have write permission on the archive directory specified in the archive command.
- The archive directory specified in the archive command does not exist.
- The archive directory specified in the archive command has no free space, so data cannot be written to it.
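The three directory-related causes above can be checked directly on the affected node. The following is a minimal Python sketch, not a tool shipped with the product: the function name `diagnose_archive_dir`, the `min_free_bytes` threshold, and the example path are hypothetical. It should be run as the operating system user that executes the archive command, because `os.access` evaluates permissions for the calling user and root typically passes the write check regardless of the directory mode.

```python
import os
import shutil


def diagnose_archive_dir(path, min_free_bytes=1024 * 1024):
    """Return a list of problems that would stop files being written to path."""
    if not os.path.isdir(path):
        # Nothing else can be checked if the directory is missing.
        return ["missing"]
    problems = []
    # os.access checks the permissions of the user running this script.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append("not_writable")
    # disk_usage reports on the file system that holds the directory.
    if shutil.disk_usage(path).free < min_free_bytes:
        problems.append("low_space")
    return problems


if __name__ == "__main__":
    # Hypothetical archive directory; substitute the one from archive_command.
    print(diagnose_archive_dir("/path/to/archive_dir"))
```

An empty list means none of the three directory-related causes applies, which leaves an incorrect archive command as the remaining candidate.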
Procedure
Modify the archive configuration.
- Log in to the FusionInsight Manager.
  - Log in to the ManageOne OM plane using a browser, then choose Alarms.
    - Login address: https://<URL of the ManageOne OM plane homepage>:31943. Example: https://oc.type.com:31943.
    - Default username: admin, default password: Huawei12#$.
  - In the alarm list, locate and click the target alarm name in the Name column. The Alarm Details and Handling Recommendations dialog box is displayed.
  - Locate the value in the IP Address/URL/Domain Name column, which is the floating IP address of FusionInsight Manager.
  - Log in to FusionInsight Manager using a browser.
    - Login address: https://<floating IP address of FusionInsight Manager>:28443/web. Example: https://10.10.192.100:28443/web.
    - Default username: admin, default password: obtain it from the system administrator.
- On FusionInsight Manager, click Alarms. In the alarm list, locate the alarm. In the Alarm Details area, obtain the node and instance for which the alarm is generated from Location.
- Use PuTTY to log in to the node for which the alarm is generated as user root or omm.
  NOTE:
  - omm user: the default password is Bigdata123@.
  - root user: the password is specified by the user before installation. Obtain it from the system administrator.
- In the postgresql.conf file (for example, /srv/BigData/mppdb/data1/master1/postgresql.conf) of the Coordinator instance or active DataNode instance, check whether the archive_command setting contains a syntax error.
- Correct the command, referring to the comments in postgresql.conf for the command details. Wait 5 minutes, then check whether the alarm is generated again.
  - If yes, go to 6 (check the archive directory).
  - If no, no further action is required.
- Check whether the archive directory specified in the archive command exists, whether you have write permission on it, and whether it has free space.
- Resolve the problem based on the check result and ensure that the archive directory can be written to. Wait 5 minutes, then check whether the alarm is generated again.
  - If yes, go to 8 (collect fault information).
  - If no, no further action is required.
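The archive_command check earlier in this procedure can be partly scripted. The sketch below is illustrative only and rests on an assumption: that MPPDB follows the PostgreSQL convention in which `%p` is replaced by the path of the file to archive, `%f` by its file name, and `%%` by a literal percent sign. Confirm this against the comments in postgresql.conf. The helper names `read_archive_command` and `expand_placeholders` are hypothetical.

```python
import re

# Matches an uncommented line such as: archive_command = 'cp %p /dir/%f'
_SETTING = re.compile(r"^\s*archive_command\s*=?\s*'((?:[^']|'')*)'")


def read_archive_command(conf_text):
    """Return the last uncommented archive_command value, or None."""
    value = None
    for line in conf_text.splitlines():
        match = _SETTING.match(line)
        if match:
            # A doubled single quote inside the value stands for one quote.
            value = match.group(1).replace("''", "'")
    return value


def expand_placeholders(command, wal_path, wal_name):
    """Substitute %p, %f and %% (assumed PostgreSQL-style placeholders)."""
    table = {"%p": wal_path, "%f": wal_name, "%%": "%"}
    return re.sub(r"%[pf%]", lambda m: table[m.group(0)], command)
```

Expanding the command with a sample file name shows the exact shell command the server would attempt, which makes a wrong destination path or a missing placeholder easier to spot before waiting for the alarm to recur.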
Collect fault information.
- On FusionInsight Manager, choose .
- Select MPPDB from the Services drop-down list box and click OK.
- Set Start Time for log collection to 1 hour before the alarm generation time and End Time to 1 hour after the alarm generation time, then click Download.
- Contact Technical Support and send the collected logs.
Alarm Clearing
After the fault is rectified, the system automatically clears this alarm.
Related Information
None