Clearing the ECC Error Count of a Specified Chip
Function
The npu-smi clear -t ecc-info -i id -c chip_id command clears the ECC error count of a specified chip.
Syntax
npu-smi clear -t ecc-info -i id -c chip_id
Parameters
Parameter | Description |
---|---|
id | Device ID. The NPU ID obtained by running the npu-smi info -l command is the device ID. |
chip_id | Chip ID. If there is only one chip, the chip ID is 0. |
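As a minimal sketch of how the two parameters fit into the command line, the following shell function checks that both values are non-negative integers and prints the resulting command without running it. The function name build_clear_cmd is hypothetical and not part of npu-smi, and the integer check is an assumption based on the values shown in this section.

```shell
# Hypothetical helper: print the clear command for a device ID and chip ID.
# Assumes both IDs are non-negative integers, as in the documented example.
build_clear_cmd() {
    dev_id=${1-}
    chip_id=${2-}
    for value in "$dev_id" "$chip_id"; do
        case $value in
            ''|*[!0-9]*)
                # Empty or containing a non-digit character: reject it.
                echo "invalid ID: '$value' (expected a non-negative integer)" >&2
                return 1
                ;;
        esac
    done
    printf 'npu-smi clear -t ecc-info -i %s -c %s\n' "$dev_id" "$chip_id"
}
```

For example, build_clear_cmd 2 0 prints the command for chip 0 on NPU 2, which can be reviewed before it is run.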
Restrictions
- This command must be run as the root user on a physical machine. If it is run as a non-root user on the physical machine, in a container, or on a VM, an error is reported.
- This command clears only historical ECC statistics and isolated-page statistics. If a multi-bit ECC error has occurred, restart the device before clearing the ECC error count.
- This command is available only in version 20.1.0 and later.
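The root-user restriction above can be checked before the command is called. The sketch below is one possible guard, not part of npu-smi: the function names is_root and clear_ecc_as_root are hypothetical, and the check covers only the user ID. It does not detect whether the shell is running in a container or on a VM.

```shell
# Hypothetical guard: succeed only when the given UID is 0 (root).
# With no argument, the UID of the current user is checked.
is_root() {
    uid=${1-$(id -u)}
    [ "$uid" -eq 0 ]
}

# Hypothetical wrapper: refuse to call npu-smi unless running as root.
# $1 is the device ID and $2 is the chip ID.
clear_ecc_as_root() {
    if ! is_root; then
        echo "npu-smi clear must be run as the root user" >&2
        return 1
    fi
    npu-smi clear -t ecc-info -i "$1" -c "$2"
}
```

Calling clear_ecc_as_root 2 0 as a non-root user returns a non-zero status without invoking npu-smi.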
Example
# Clear the ECC error count of chip 0 on NPU 2.
npu-smi clear -t ecc-info -i 2 -c 0
Status : OK
Message : Clear ecc-info successfully
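In a script, the result can be checked by looking for the Status line in the command output. The following is a sketch under the assumption that a successful run prints a Status field with the value OK, as in the example above. The function name ecc_clear_ok is hypothetical, and the output of a failed run is not documented here, so the function treats anything without a matching line as a failure.

```shell
# Hypothetical check: read command output on stdin and succeed only if it
# contains a "Status : OK" field (whitespace around the colon may vary).
ecc_clear_ok() {
    grep -Eq 'Status[[:space:]]*:[[:space:]]*OK'
}
```

A possible use is npu-smi clear -t ecc-info -i 2 -c 0 | ecc_clear_ok && echo "ECC count cleared".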