What Do I Do If Flush of Logs Fails?
If logs fail to be flushed, perform the following steps:
- Run the following command to check whether the log processes (slogd, sklogd, and log-daemon) on the host are started:
ps -elf | grep log
If the process information is displayed, the processes have been started. Otherwise, perform the following steps to manually start the log processes:
- Switch to a common user (for example, the HwHiAiUser user):
su HwHiAiUser
- Restart these log processes:
- Start the slogd process:
nohup /usr/local/Ascend/driver/tools/slogd > /dev/null 2>&1 &
- Start the sklogd process:
nohup /usr/local/Ascend/driver/tools/sklogd > /dev/null 2>&1 &
- Start the log-daemon process:
nohup /usr/local/Ascend/driver/tools/log-daemon > /dev/null 2>&1 &
/usr/local/Ascend in the preceding commands is the default Driver installation path. Replace it with the actual installation path if the Driver is installed elsewhere.
- Check whether the log processes are successfully started:
ps -elf | grep log
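The check in this step can also be scripted. The following is a minimal sketch, not part of the Log tool: the function name missing_log_processes is hypothetical, and it assumes that the command names reported by ps are exactly slogd, sklogd, and log-daemon. It reads one command name per line and prints each expected log process that is absent.

```shell
# missing_log_processes (hypothetical helper): reads command names,
# one per line, on standard input and prints every expected host log
# process that does not appear in that list.
missing_log_processes() {
    names=$(cat)
    for expected in slogd sklogd log-daemon; do
        # -x matches the whole line, so "slogd" does not match "sklogd".
        printf '%s\n' "$names" | grep -qx -- "$expected" || echo "$expected"
    done
}
```

To use it on the host, pipe the process list into the function, for example: ps -e -o comm= | missing_log_processes. Empty output means that all three processes were found.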
- Run the following command to check whether maintenance and test logs of the Log tool contain alarm or error messages:
cat /var/log/npu/slog/slogd/slogdlog
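For a long log file, filtering can be quicker than reading the whole output of cat. The following sketch assumes that alarm and error messages contain the words "alarm" or "error" in some letter case. The actual message format of the Log tool is not described here, so treat the keywords and the function names as illustrative only.

```shell
# count_problem_lines FILE (hypothetical helper): print the number of
# lines that mention "error" or "alarm", ignoring case.
count_problem_lines() {
    grep -ciE 'error|alarm' -- "$1"
}

# show_problem_lines FILE (hypothetical helper): print the last 20
# matching lines, each prefixed with its line number in FILE.
show_problem_lines() {
    grep -niE 'error|alarm' -- "$1" | tail -n 20
}
```

For example, run show_problem_lines with the log file path from the preceding command as its argument.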
- Validate the permission and owner of the log directories and configuration file on the host. Table 2-1 shows the permission and owner of the log directories and configuration file. The owner is the HwHiAiUser user, as shown in Figure 2-1.
Table 2-1 Log directories and configuration file
| Log Directory or Configuration File | Description | Owner | Permission |
| --- | --- | --- | --- |
| /var/log/npu/slog/ | Root directory for flushing logs | HwHiAiUser | 750 |
| /var/log/npu/slog/host-0/ | Log flush directory on the host | HwHiAiUser | 750 |
| /var/log/npu/slog/device-os-id/ | Ctrl CPU log flush directory on the device | HwHiAiUser | 750 |
| /var/log/npu/slog/device-id/ | Non-Ctrl CPU log flush directory on the device | HwHiAiUser | 750 |
| /usr/slog/slog/ | Socket channel directory (channel directory for local communication of the slogd process) | HwHiAiUser | 660 |
| /var/log/npu/conf/slog/slog.conf | Log configuration file | HwHiAiUser | 640 |
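The values in Table 2-1 can be compared against the file system with a short script. This is a sketch under two assumptions: GNU stat is available (the -c option is not portable to every system), and the helper names check_perm and check_host_log_perms are hypothetical. The device-os-id and device-id directories are omitted because their names contain IDs that depend on the environment.

```shell
# check_perm PATH OWNER MODE (hypothetical helper): compare the actual
# owner and octal permission of PATH with the expected values.
check_perm() {
    # stat -c is GNU coreutils syntax; %U is the owner name, %a the octal mode.
    actual=$(stat -c '%U %a' "$1" 2>/dev/null) || { echo "MISSING $1"; return 1; }
    if [ "$actual" = "$2 $3" ]; then
        echo "OK $1"
    else
        echo "MISMATCH $1: expected '$2 $3', found '$actual'"
        return 1
    fi
}

# check_host_log_perms (hypothetical helper): check the fixed paths
# listed in Table 2-1 and return non-zero if any of them differs.
check_host_log_perms() {
    rc=0
    check_perm /var/log/npu/slog/ HwHiAiUser 750 || rc=1
    check_perm /var/log/npu/slog/host-0/ HwHiAiUser 750 || rc=1
    check_perm /usr/slog/slog/ HwHiAiUser 660 || rc=1
    check_perm /var/log/npu/conf/slog/slog.conf HwHiAiUser 640 || rc=1
    return "$rc"
}
```

Running check_host_log_perms prints one line per path. Any MISSING or MISMATCH line points to an entry that should be compared with the table by hand.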
- Run the following command to check whether the space of the log working directory (/usr/slog) and the log flush directories (for example, /var/log/npu/slog) is sufficient:
df -h
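To look only at the file systems that hold the log directories, df can be given a path. The sketch below uses the POSIX output format (df -P) and reads the capacity column. The helper names and the 90 percent default threshold are assumptions chosen for illustration, not values defined by the Log tool.

```shell
# usage_percent PATH (hypothetical helper): print the used capacity,
# as an integer percentage, of the file system that contains PATH.
usage_percent() {
    # With -P, the second output line holds the data and field 5 is "NN%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_space PATH [LIMIT] (hypothetical helper): return non-zero when
# the used capacity is at or above LIMIT percent (default 90, an
# arbitrary illustrative value).
check_space() {
    limit=${2:-90}
    used=$(usage_percent "$1")
    if [ "$used" -ge "$limit" ]; then
        echo "WARN $1: ${used}% used"
        return 1
    fi
    echo "OK $1: ${used}% used"
}
```

For example, run check_space /usr/slog and check_space /var/log/npu/slog to check the two locations named in this step.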
- If the log flush still fails, perform the following operations:
- If the host-side logs fail to be flushed, restart the log processes on the host. Run the following command to stop each process:
kill -15 pid
Replace pid with the actual process ID. You can run the ps -elf | grep log command to query the IDs of the log processes (slogd, sklogd, and log-daemon). After stopping the processes, manually start them again by referring to step 1.
- If the logs on the device still fail to be flushed and you have the permission to log in to the device, log in to the device and run the following command to check whether the log processes (slogd and sklogd) on the device have been started:
ps -elf | grep log
If process information is displayed, the log processes (slogd and sklogd) on the device have been started. Otherwise, perform the following steps to manually start the log processes:
- Switch to a common user (for example, the HwHiAiUser user):
su HwHiAiUser
- Manually start the slogd process on the device:
nohup /var/slogd > /dev/null 2>&1 &
- Manually start the sklogd process on the device:
nohup /var/sklogd > /dev/null 2>&1 &
- Check whether the processes are successfully started:
ps -elf | grep log