Alarms Are Generated When the Flash Usage for the E6000 SMM Is Too High

Publication Date: 2015-06-19
Issue Description
Environment configuration:
Shelf management module (SMM): 2.07

Symptom:

During preventive maintenance inspection (PMI) in an office, alarms indicating that the flash usage for the SMM is too high are generated, as shown in Figure 1.

Figure 1 Alarms for too high flash usage on the SMM

Handling Process
1.  Log in to the SMM on the CLI, and run the df -ah command to check the current flash usage for the SMM. The root directory usage is 85%, as shown in Figure 2, which equals the flash alarm threshold of 85%.

Figure 2 Viewing the current flash usage
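
The threshold check in step 1 can be sketched as a small script. This is only an illustration, not part of the SMM software: the function names root_usage_pct and at_threshold are hypothetical, and the sketch assumes that awk is available and that df accepts the POSIX -P option so that each file system is printed on one line.

```shell
#!/bin/sh
# Hypothetical helper: read df-style output on stdin and print the Use%
# value (digits only) of the first entry mounted at the given mount point.
root_usage_pct() {
    awk -v mp="${1:-/}" '
        NR > 1 && $NF == mp { sub(/%/, "", $(NF-1)); print $(NF-1); exit }
    '
}

# Hypothetical helper: succeed when the usage has reached the threshold.
# The default of 85 is the alarm threshold quoted in step 1.
at_threshold() {
    [ "$1" -ge "${2:-85}" ]
}

# Example use on a live system (assumption: df supports -P):
#   pct=$(df -P / | root_usage_pct /)
#   at_threshold "$pct" 85 && echo "root usage ${pct}% has reached the alarm threshold"
```

Reading the df output from stdin keeps the parsing separate from the command itself, so the same helper can be applied to output saved from another SMM.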



2.  Run the du -sh * command in the root directory to view the sizes of folders and files in the root directory, as shown in Figure 3.

Figure 3 Sizes of files for a faulty SMM



3.  Find an SMM that has not generated the flash usage alarm, and use the same method to view the sizes of folders and files in its root directory, as shown in Figure 4.

Figure 4 Sizes of files for a normal SMM



4.  Compare the sizes of folders and files in the root directory between the two SMMs. The space used by each folder and file is the same on both SMMs, and no abnormal file exists.
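
The comparison in steps 2 to 4 can also be done mechanically. The following is a minimal sketch under stated assumptions: the listings are saved with du -sk * (sizes in KB rather than the human-readable -h form, so that they compare exactly), file names contain no spaces, and du_diff is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: compare two saved `du -sk *` listings
# (format: "<size-in-KB> <name>") and print the entries that differ.
# $1 = listing from the normal SMM, $2 = listing from the faulty SMM.
du_diff() {
    awk '
        NR == FNR     { ref[$2] = $1; next }   # first file: remember sizes
        !($2 in ref)  { print $2 ": only in second listing (" $1 " KB)"; next }
        ref[$2] != $1 { print $2 ": " ref[$2] " KB -> " $1 " KB" }
    ' "$1" "$2"
}

# Example use (run in the root directory of each SMM, then copy both files
# to one place; the file names are placeholders):
#   du -sk * > normal.txt      # on the SMM without the alarm
#   du -sk * > faulty.txt      # on the SMM with the alarm
#   du_diff normal.txt faulty.txt
```

Empty output means that the two listings match, which corresponds to the result described in step 4.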

5.  Because no abnormal file exists, the most likely cause is that a constantly running process is holding resources and not releasing them in time. The following processes run on the SMM:
  • Component monitoring and log output processes: vpem.out, vfan.out, and vnem.out.
  • Web-related processes: httpd and snmpd.
6.  Run the top command to view the CPU usage. The load average is lower than 5 and the CPU usage is lower than 80%, which indicates that the CPU load is light, as shown in Figure 5. The CPU usage reveals no abnormal process, so the processes must be checked by other methods.

Figure 5 CPU usage




7.  To check the key processes, run the ps -ef command to query the process IDs (PIDs) of the monitoring processes for fans, power supplies, and switch blades, as shown in Figure 6. The numbers in red boxes are the PIDs.

NOTE:
Only an active SMM can be used to query the monitoring processes of fans, power supplies, and switch blades.

Figure 6 PIDs of fans, power supplies, and switch blades
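
Instead of reading the PIDs off the screen, they can be filtered out of the ps output. The sketch below is illustrative only: pids_of is a hypothetical helper, and it assumes the common ps -ef column layout (PID in column 2, command in column 8), which may differ on the SMM.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` output on stdin and print the PID of
# every process whose command name (without its directory) matches $1.
# Assumption: PID is column 2 and the command starts in column 8.
pids_of() {
    awk -v name="$1" '
        NR > 1 {
            n = split($8, parts, "/")      # strip any leading path
            if (parts[n] == name) print $2
        }
    '
}

# Example use with the process names listed in step 5:
#   for p in vpem.out vfan.out vnem.out httpd snmpd; do
#       echo "$p: $(ps -ef | pids_of "$p" | tr '\n' ' ')"
#   done
```

Matching on the command name in a fixed column avoids picking up unrelated lines, such as a grep process whose arguments happen to contain the same name.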



8.  Run the kill command to end the monitoring processes of fans, power supplies, and switch blades. Then run the df -ah command again. The root directory usage is still 85%.

9.  Use the same method to end the httpd process. The root directory usage is still 85%.

10.  Use the same method to end the snmpd process. The root directory usage drops from 85% to 75%. Running the df -ah command repeatedly shows that the flash usage for the SMM gradually falls to 64% (about 65% in normal condition), as shown in Figure 7. Therefore, the snmpd process is the abnormal process.

Figure 7 Flash usage for the SMM
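
Running df repeatedly, as in step 10, can be scripted as a simple sampling loop. This is a sketch rather than a vendor tool: sample_usage is a hypothetical name, and it assumes that df accepts the POSIX -P option and that awk and sleep are available in the SMM shell.

```shell
#!/bin/sh
# Hypothetical helper: print the root file system Use% value $1 times,
# waiting $2 seconds between samples.
sample_usage() {
    count=$1
    interval=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        # With -P each file system is on one line; Use% is the
        # second-to-last column.
        df -P / | awk 'NR > 1 { print $(NF-1); exit }'
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example use after ending a suspect process: six samples, 10 seconds apart.
#   sample_usage 6 10
```

A falling series of values after a process is ended points to that process as the one holding the space, which is how snmpd is identified in step 10.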



11.  Reproduce the problem by using fault injection. A dedicated software tool shows that a large number of web files are opened and then closed, but the snmpd process does not release the occupied flash resources in time.
Root Cause
A large number of web files in the SMM software are opened and then closed, and the snmpd process keeps occupying the SMM flash resources instead of releasing them. As a result, the flash usage is too high.

Solution
Use Telnet to log in to the active and standby SMMs separately, and run the reboot command to restart them.

Suggestions
Note:
Methods of identifying the active and standby SMMs are as follows:

Run the smmget [-l smm] -d redundancy command to determine the active SMM. In the command output, the SMM marked active is the active SMM. Viewed from the front of the E6000 shelf, the SMM in the upper left corner is SMM 1, and the SMM in the upper right corner is SMM 2.

root@SMM:/#smmget -d redundancy
The Redundancy States of SMMs:
SMM1: Present(active)*
SMM2: Present(standby)
* = The SMM you are currently logged into.


Identify the active and standby SMMs based on the ACT indicator status, as shown in Table 1.

Table 1 Relationships between the ACT indicator status and the active and standby SMMs 
