Overview
The monitoring framework monitors errors while the OS is running and reports them to the alarm module, which in turn reports alarms to the alarm management platform. The monitoring framework is provided as a service. You can start, stop, restart, or reload this service by running the following commands:
systemctl start sysmonitor
systemctl stop sysmonitor
systemctl restart sysmonitor
systemctl reload sysmonitor
Precautions
- Only one monitoring service can run at a time.
- Ensure that all configuration files are valid. Otherwise, the monitoring service may behave abnormally.
Configuration File
/etc/sysconfig/sysmonitor is the configuration file of the sysmonitor framework (system monitoring service). This file defines the monitoring period of each monitoring item, and specifies whether to enable monitoring and whether to report alarms if errors are detected.
Example configurations:
Spaces are not allowed between the configuration item, equal sign (=), and double quotation mark (").

    # Whether to enable monitoring for a key process.
    PROCESS_MONITOR="on"
    # Specify the monitoring period (unit: second) for a key process.
    PROCESS_MONITOR_PERIOD="3"
    # Specify the interval (unit: minute) at which the system attempts to start a key process if the process fails to be restored.
    PROCESS_RECALL_PERIOD="1"
    # Timeout interval during the recovery of key processes (unit: second).
    PROCESS_RESTART_TIMEOUT="90"
    # Whether to report alarms of a key process.
    PROCESS_ALARM="off"
    # Whether to enable filesystem monitoring.
    FILESYSTEM_MONITOR="off"
    # Whether to report alarms of the filesystem.
    FILESYSTEM_ALARM="off"
    # Whether to enable signal monitoring.
    SIGNAL_MONITOR="on"
    # Whether to report signal alarms.
    SIGNAL_ALARM="off"
    # Whether to enable disk space monitoring.
    DISK_MONITOR="on"
    # Whether to report disk space alarms.
    DISK_ALARM="off"
    # Specify the disk monitoring period (unit: second).
    DISK_MONITOR_PERIOD="60"
    # Whether to enable NIC monitoring.
    NETCARD_MONITOR="on"
    # Whether to report NIC alarms.
    NETCARD_ALARM="off"
    # Whether to enable file monitoring.
    FILE_MONITOR="on"
    # Whether to report file alarms.
    FILE_ALARM="off"
    # Whether to enable CPU usage monitoring.
    CPU_MONITOR="on"
    # Whether to report CPU usage alarms.
    CPU_ALARM="off"
    # Whether to enable memory usage monitoring.
    MEM_MONITOR="on"
    # Whether to report memory usage alarms.
    MEM_ALARM="off"
    # Whether to monitor the number of processes (threads).
    PSCNT_MONITOR="on"
    # Whether to report alarms related to the process (thread) count.
    PSCNT_ALARM="off"
    # Whether to monitor the total number of file descriptors in the system.
    FDCNT_MONITOR="on"
    # Whether to report alarms related to the file descriptor count.
    FDCNT_ALARM="off"

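The format rule above (no spaces between the configuration item, the equal sign, and the double quotation mark) can be checked before the service is restarted. The following is a minimal sketch and not part of sysmonitor: the function name `check_sysmonitor_format` and the matching pattern are assumptions, and the pattern may be stricter than the service itself (for example, it rejects trailing comments and key names that are not uppercase letters and underscores).

```shell
# check_sysmonitor_format FILE  (hypothetical helper, not shipped with sysmonitor)
# Prints every line that is neither blank, nor a comment, nor of the form KEY="value".
# Returns 0 if every setting line is well formed, 1 otherwise.
check_sysmonitor_format() {
    # First grep drops blank lines and comments; second grep drops well-formed settings.
    bad=$(grep -Ev '^[[:space:]]*(#|$)' "$1" | grep -Ev '^[A-Z_]+="[^"]*"$' || true)
    if [ -n "$bad" ]; then
        printf 'Malformed lines in %s:\n%s\n' "$1" "$bad" >&2
        return 1
    fi
    return 0
}

# Example call (commented out so the snippet has no side effects):
# check_sysmonitor_format /etc/sysconfig/sysmonitor
```

A line such as `PROCESS_MONITOR = "on"` would be reported by this check because of the spaces around the equal sign.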
Configuration Item | Description | Mandatory or Not | Default Value |
---|---|---|---|
PROCESS_MONITOR | Whether to enable monitoring for a key process. Options: on or off. | No | on |
PROCESS_MONITOR_PERIOD | Monitoring period for a key process. Unit: second. | No | 3s |
PROCESS_RECALL_PERIOD | Interval at which the system attempts to start a key process if the process fails to be restored. Unit: minute. Range: integer from 1 to 1440. | No | 1 minute |
PROCESS_RESTART_TIMEOUT | Timeout interval during the recovery of key processes. Unit: second. Range: integer from 30 to 300. NOTE: For example, if the recovery time of the rsyslog process is long because the IP address configured on the DNS server is unreachable, you need to change the value of this parameter. | No | 90s |
PROCESS_ALARM | Whether to report alarms of a key process. Options: on or off. | No | off |
FILESYSTEM_MONITOR | Whether to enable monitoring for the ext3 filesystem. Options: on or off. | No | off |
FILESYSTEM_ALARM | Whether to report alarms of the ext3 filesystem. Options: on or off. | No | off |
SIGNAL_MONITOR | Whether to enable signal monitoring. Options: on or off. | No | on |
SIGNAL_ALARM | Whether to report signal alarms. Options: on or off. | No | off |
DISK_MONITOR | Whether to enable disk space monitoring. Options: on or off. | No | on |
DISK_ALARM | Whether to report disk space alarms. Options: on or off. | No | off |
DISK_MONITOR_PERIOD | Disk monitoring period. Unit: second. | No | 60s |
NETCARD_MONITOR | Whether to enable NIC monitoring. Options: on or off. | No | on |
NETCARD_ALARM | Whether to report NIC alarms. Options: on or off. | No | off |
FILE_MONITOR | Whether to enable file monitoring. Options: on or off. | No | on |
FILE_ALARM | Whether to report file alarms. Options: on or off. | No | off |
CPU_MONITOR | Whether to enable CPU usage monitoring. Options: on or off. | No | on |
CPU_ALARM | Whether to report CPU usage alarms. Options: on or off. | No | off |
MEM_MONITOR | Whether to enable memory usage monitoring. Options: on or off. | No | on |
MEM_ALARM | Whether to report memory usage alarms. Options: on or off. | No | off |
PSCNT_MONITOR | Whether to monitor the number of processes (threads). Options: on or off. | No | on |
PSCNT_ALARM | Whether to report alarms related to the process (thread) count. Options: on or off. | No | off |
FDCNT_MONITOR | Whether to monitor the total number of file descriptors (FDs). Options: on or off. | No | on |
FDCNT_ALARM | Whether to report alarms related to the FD count. Options: on or off. | No | off |
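The two value ranges stated in the table (PROCESS_RECALL_PERIOD from 1 to 1440, PROCESS_RESTART_TIMEOUT from 30 to 300) lend themselves to a quick pre-flight check. The sketch below is illustrative only: `get_value`, `in_range`, and `check_ranges` are hypothetical helper names, and the file is parsed with `sed` rather than by sysmonitor's own parser, so how the service itself handles out-of-range values is not shown here.

```shell
# get_value FILE KEY: print the value from a KEY="value" line (last match wins).
get_value() {
    sed -n "s/^$2=\"\(.*\)\"\$/\1/p" "$1" | tail -n 1
}

# in_range VALUE MIN MAX: succeed only if VALUE is a non-negative integer in [MIN, MAX].
in_range() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# check_ranges FILE: validate the two documented ranges.
# Items that are absent (or empty) pass, because the table marks them as not mandatory.
check_ranges() {
    status=0
    v=$(get_value "$1" PROCESS_RECALL_PERIOD)
    if [ -n "$v" ] && ! in_range "$v" 1 1440; then
        echo "PROCESS_RECALL_PERIOD out of range (1-1440): $v" >&2
        status=1
    fi
    v=$(get_value "$1" PROCESS_RESTART_TIMEOUT)
    if [ -n "$v" ] && ! in_range "$v" 30 300; then
        echo "PROCESS_RESTART_TIMEOUT out of range (30-300): $v" >&2
        status=1
    fi
    return $status
}

# Example call (commented out so the snippet has no side effects):
# check_ranges /etc/sysconfig/sysmonitor
```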
- After the /etc/sysconfig/sysmonitor configuration file is modified, restart the sysmonitor service to make the modifications take effect.
- By default, ext3 filesystem monitoring and alarming are disabled on non-ext3 filesystems.
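Changing a single item while keeping the required `KEY="value"` format can be sketched as follows. This is an assumption-laden example, not a tool provided with sysmonitor: `set_item` is a hypothetical helper, it relies on GNU `sed -i`, and it only rewrites an item that already exists in the file.

```shell
# set_item FILE KEY VALUE  (hypothetical helper)
# Rewrites an existing KEY="..." line in place, with no spaces around '='.
# Returns 1 and leaves FILE unchanged if KEY is not present.
# VALUE must not contain '|' or '&', which are special in the sed expression below.
set_item() {
    grep -q "^$2=" "$1" || return 1
    sed -i "s|^$2=.*|$2=\"$3\"|" "$1"
}

# Example (commented out so the snippet has no side effects):
# set_item /etc/sysconfig/sysmonitor CPU_ALARM on
# systemctl restart sysmonitor
```

The restart in the commented example follows the note above that modifications take effect after the service is restarted.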
Monitoring Item | Configurable or Not |
---|---|
ext3 filesystem faults | No |
Key process/module faults | Yes |
Disk space | Yes |
inode | Yes |
Files | Yes |
Signals | Yes |
NICs | Yes |
CPU usage | Yes |
Memory usage | Yes |
Number of processes | Yes |
Total number of system FDs | Yes |
Command Reference
- Starting the monitoring service:
systemctl start sysmonitor
- Stopping the monitoring service:
systemctl stop sysmonitor
- Restarting the monitoring service:
systemctl restart sysmonitor
After the /etc/sysconfig/sysmonitor configuration file is modified, restart the monitoring service to make the modifications take effect.
- Reloading the monitoring service:
systemctl reload sysmonitor
After the /etc/sysconfig/sysmonitor configuration file is modified, you can reload the monitoring service to make the modifications take effect dynamically. However, if the NIC monitoring item is modified, you must restart the monitoring service so that the modifications can take effect.
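The choice between reload and restart described above can be sketched as a small helper that compares a backup of the configuration file with the edited version. This is only an illustration: `choose_action` is a hypothetical name, and it assumes that "the NIC monitoring item" refers to the NETCARD_* settings in /etc/sysconfig/sysmonitor.

```shell
# choose_action OLD_CONF NEW_CONF  (hypothetical helper)
# Prints "restart" if any NETCARD_* line differs between the two files,
# otherwise prints "reload".
choose_action() {
    old=$(grep '^NETCARD_' "$1" | sort || true)
    new=$(grep '^NETCARD_' "$2" | sort || true)
    if [ "$old" = "$new" ]; then
        echo reload
    else
        echo restart
    fi
}

# Example workflow (commented out so the snippet has no side effects):
# cp /etc/sysconfig/sysmonitor /tmp/sysmonitor.bak    # before editing
# ... edit /etc/sysconfig/sysmonitor ...
# systemctl "$(choose_action /tmp/sysmonitor.bak /etc/sysconfig/sysmonitor)" sysmonitor
```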