Atlas 300I Inference Card 24.1.0 NPU Driver and Firmware Installation Guide (Model 3010) 03

Installing a Driver (*.run)

This section uses {product name}-npu-driver_x.x.x_linux-{arch}.run as an example package name to describe driver installation on a host. Replace it with the name of the driver package that matches your host system.

Prerequisites

Before installing the driver package, check for and install the required dependencies. For details, see Installing Dependencies.

Precautions

If a driver package of version NPU 1.X.X is installed, uninstall it before installing a driver package of version NPU 20.X.X or later.

Do not reset or power off the host or device while a software package is being installed or upgraded. Otherwise, the device fails to boot or the upgrade fails.

Procedure

  • For an initial installation, install the driver first and then the firmware. For an overwrite installation or an upgrade, install the firmware first and then the driver. For details about how to install the firmware, see Installing the Firmware (*.run).
  • If the --install-for-all option is used during the initial installation, retain it during any overwrite installation.
  • For the Atlas 300I inference card (model 3010), also upgrade the MCU so that the driver, firmware, and MCU versions match. For details, see "Upgrading the MCU" in Atlas 300I Inference Card 24.1.0 NPU Driver and Firmware Upgrade Guide (Models 3000, 3010).
  1. Upload the driver package obtained from Obtaining Software Packages to any directory (for example, /opt) in Linux.
  2. Use PuTTY to log in to the OS CLI of the server. For details, see Logging In to the CLI Using PuTTY over a Network Port.
  3. Run the following command to switch to the root user:

    su - root

  4. Run the following command to go to the directory where the software package is stored, for example, /opt:

    cd /opt

  5. Run the following command to grant the execute permission on the software package:

    chmod +x {product name}-npu-driver_x.x.x_linux-{arch}.run

  6. Run the following command to check the consistency and integrity of the .run installation package:

    ./{product name}-npu-driver_x.x.x_linux-{arch}.run --check

    If the following information is displayed, the driver package verification is successful:
    Verifying archive integrity...  100%   SHA256 checksums are OK. All good.

    The package is verified using a SHA256 checksum. If "./{product name}-npu-driver_x.x.x_linux-{arch}.run does not contain an embedded MD5 checksum." and "./{product name}-npu-driver_x.x.x_linux-{arch}.run does not contain a CRC checksum" are displayed during the verification, the package simply does not use MD5 or CRC checksums, and the messages can be ignored.
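
The check in this step can be scripted. The following is a minimal sketch that assumes success is detected by matching the message text quoted above. The function name check_output_ok is hypothetical and is not part of the driver package.

```shell
# Decide whether captured "--check" output indicates a verified package.
# Reads the output on stdin and returns 0 only if the SHA256 success text
# quoted in this guide is present. The MD5 and CRC notices are not treated
# as errors, because this guide says they can be ignored.
check_output_ok() {
    grep -q 'SHA256 checksums are OK'
}

# Usage sketch (PKG stands for the actual driver package file name):
#   if "./$PKG" --check 2>&1 | check_output_ok; then
#       echo "package verified"
#   else
#       echo "verification message not found; do not install" >&2
#   fi
```

Matching on the message keeps the sketch independent of the installer's exit status, which this guide does not document.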

  7. Install the driver. The default installation path is /usr/local/Ascend.

    ./{product name}-npu-driver_x.x.x_linux-{arch}.run --full
    • Installation log path: /var/log/ascend_seclog/ascend_install.log
    • File that records the installation path, installation command, and running user of the installed package: /etc/ascend_install.info
    • During the .run package installation, the dynamic library libdcmi.so and header file dcmi_interface_api.h are copied to the /usr/local/dcmi/ directory.
    • If the created running user is not HwHiAiUser, specify a running user using the --install-username=username --install-usergroup=usergroup parameter when installing the driver package.
    • If the root user is specified as the running user, the --install-for-all parameter must be included in the installation command, as shown below. This configuration may pose security risks.

      --install-username=root --install-usergroup=root --install-for-all

    • If the server does not have a BMC, you can only install the driver in the default path.
    • If physical machines do not support PCIe card hot reset, run the ./{product name}-npu-driver_x.x.x_linux-{arch}.run --full --force command to install the driver. For details about the parameters, see Parameters and Commands.
    • Device-side system logs are transferred to the host by using the msnpureport tool. For details about the export operations and the path for storing exported logs, see "msnpureport Instructions" in Atlas 300I Inference Card 24.1.0 Black Box Log Reference (Models 3000, 3010). In a container, system logs on the device cannot be viewed and cannot be exported using the msnpureport tool.
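
The option rules in the notes above can be collected in a small helper. This is a sketch only: the function name driver_install_opts and its argument order are hypothetical, and it emits only options that appear in this guide.

```shell
# Print the installer options for a given running user and group.
# Arguments: <username> <usergroup> [force]
# HwHiAiUser is treated as the default running user, so no user options
# are added for it.
driver_install_opts() {
    user=$1
    group=$2
    opts="--full"
    if [ "$user" != "HwHiAiUser" ]; then
        opts="$opts --install-username=$user --install-usergroup=$group"
    fi
    # This guide requires --install-for-all when root is the running user.
    if [ "$user" = "root" ]; then
        opts="$opts --install-for-all"
    fi
    # Pass "force" as the third argument on physical machines that do not
    # support PCIe card hot reset.
    if [ "${3:-}" = "force" ]; then
        opts="$opts --force"
    fi
    printf '%s\n' "$opts"
}

# Usage sketch (PKG stands for the actual driver package file name; the
# unquoted substitution is intentional so the options split into words):
#   "./$PKG" $(driver_install_opts root root)
```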

    If you specify an installation path, for example, /test/HiAI/, complete the installation by running the following command:

    ./{product name}-npu-driver_x.x.x_linux-{arch}.run --full --install-path=/test/HiAI/
    • If the specified path does not exist, the directory is created automatically during the installation. If the path has multiple directory levels, automatic creation applies only when the last level is the only one missing; the parent directories must already exist.
    • In the scenario where the specified path exists:
      • If the owner of all levels of directories in the path is the root user, ensure that the permission on all levels of directories is at least 755. If the path permission does not meet requirements, run the following command to change the path permission:

        chmod 755 path

      • If the owner of any directory level in the path is not the root user, change the owner to the root user and ensure that the permission on all levels of directories is at least 755. Run the following command to change the path owner to root:

        chown root:group_name path
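
The owner and permission rules for an existing installation path can be checked before running the installer. The following sketch assumes GNU coreutils stat (the -c option). The function names perm_at_least_755, path_levels, and audit_install_path are hypothetical.

```shell
# Return 0 if an octal mode such as 755 or 775 grants at least rwxr-xr-x.
# Returns 2 if the argument is not an octal mode string.
perm_at_least_755() {
    case $1 in
        ''|*[!0-7]*) return 2 ;;
    esac
    [ $(( 0$1 & 0755 )) -eq $(( 0755 )) ]
}

# Print every directory level of an absolute path, one per line.
# For /test/HiAI/ this prints /test and then /test/HiAI.
path_levels() {
    rest=${1#/}
    current=""
    while [ -n "$rest" ]; do
        part=${rest%%/*}
        if [ -n "$part" ]; then
            current="$current/$part"
            printf '%s\n' "$current"
        fi
        case $rest in
            */*) rest=${rest#*/} ;;
            *)   rest="" ;;
        esac
    done
}

# Report each existing level that breaks the owner or permission rule.
# Levels that do not exist yet are skipped.
audit_install_path() {
    path_levels "$1" | while IFS= read -r dir; do
        [ -d "$dir" ] || continue
        owner=$(stat -c '%U' "$dir")
        mode=$(stat -c '%a' "$dir")
        [ "$owner" = "root" ] || printf '%s: owner is %s, expected root\n' "$dir" "$owner"
        perm_at_least_755 "$mode" || printf '%s: mode %s is below 755\n' "$dir" "$mode"
    done
}

# Usage sketch: audit_install_path /test/HiAI/
```

No output from audit_install_path means that every existing level is owned by root and has a permission of at least 755.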

    If the following information is displayed, the driver is successfully installed:
    Driver package installed successfully! 

  8. (Optional) If the following information is displayed during the installation, DKMS is not installed and the default kernel source code path, for example, /lib/modules/`uname -r`/build, does not exist. Respond to the prompts as follows:

    [WARNING]rebuild ko has something wrong, detail in /var/log/ascend_seclog/ascend_rebuild.log
    Do you want to try build driver after input kernel absolute path? [y/n]:

    If you want to continue the installation, enter y.

    When the following information is displayed, enter the actual path of the kernel source code, for example, /lib/modules/`uname -r`/build-bak:

    Please input your kernel absolute path:

    Press Enter to continue the installation.

    • If DKMS and related components such as kernel-header and kernel-devel have been installed, the system automatically compiles and installs the DKMS driver.
    • If DKMS has not been installed but the default kernel source code path (for example, /lib/modules/`uname -r`/build) already exists, the kernel source code in that path is automatically used to compile the driver.
    • If DKMS is used for installation and information similar to "rmdir: failed to remove 'updates': Directory not empty" is displayed in the log, you can ignore the information because it does not affect the actual installation.
    If the following information is displayed, the driver is successfully installed:
    Driver package installed successfully! 
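
Whether the prompt in this step will appear can be estimated before installation. The sketch below encodes the three cases described above. The function names expected_build_mode and probe_host are hypothetical, and finding the dkms command on PATH is only an approximation of "DKMS and related components are installed".

```shell
# Map the two host conditions from this step to the behaviour this guide
# describes. Both arguments are "yes" or "no":
#   $1  DKMS and related components are installed
#   $2  the default kernel source code path exists
expected_build_mode() {
    if [ "$1" = "yes" ]; then
        echo "dkms"                   # compiled and installed through DKMS
    elif [ "$2" = "yes" ]; then
        echo "default-kernel-source"  # built from the default source path
    else
        echo "prompt"                 # installer asks for a kernel path
    fi
}

# Probe the current host and print the expected mode.
probe_host() {
    has_dkms=no
    has_src=no
    if command -v dkms >/dev/null 2>&1; then
        has_dkms=yes
    fi
    if [ -d "/lib/modules/$(uname -r)/build" ]; then
        has_src=yes
    fi
    expected_build_mode "$has_dkms" "$has_src"
}

# Usage sketch: probe_host
```

If probe_host prints "prompt", have the absolute path of the kernel source code ready before starting the installation.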

  9. (Optional) Determine whether to restart the system based on the displayed information. If the system needs to be restarted, run the following command. Otherwise, skip this step.

    reboot

  10. Run the following command to check whether the driver is successfully loaded:

    npu-smi info

    If information similar to the following is displayed, the driver is successfully loaded. The version number and the number of devices depend on your environment. If no such information is displayed, the loading has failed. Contact Huawei technical support.

    +----------------------------------------------------------------------------------------------------+
    | npu-smi 24.1.0                            Version: 24.1.0                                      |
    +-------------------+-----------------+--------------------------------------------------------------+
    | NPU     Name      | Health          | Power(W)          Temp(C)              Hugepages-Usage(page) |
    | Chip    Device    | Bus-Id          | AICore(%)         Memory-Usage(MB)                           |
    +===================+=================+==============================================================+
    | 6       310       | OK              | 12.8              73             0       / 970               |
    | 0       3         | 0000:89:00.0    | 0                 2703 / 8192                                |
    +-------------------+-----------------+--------------------------------------------------------------+
    | 6       310       | OK              | 12.8              74             0       / 970               |
    | 1       4         | 0000:8A:00.0    | 0                 2867 / 8192                                |
    +-------------------+-----------------+--------------------------------------------------------------+
    | 6       310       | OK              | 12.8              70            0        / 970               |
    | 2       5         | 0000:8B:00.0    | 0                 2867 / 8192                                |
    +-------------------+-----------------+--------------------------------------------------------------+
    | 6       310       | OK              | 12.8              65            0        / 970               |
    | 3       6         | 0000:8C:00.0    | 0                 2867 / 8192                                |
    +===================+=================+==============================================================+
    | No running processes found in NPU 6                                                                |
    +===================+=================+==============================================================+
    • If the loading fails, run the dmesg command to view the kernel logs. If "/installation_path/driver/device/davinci_mini.fd copy err" is displayed, uninstall the current driver using the procedure for its package format, and then install the driver again in the default path. For details, see Uninstalling a Driver.
    • In the command output, the field following npu-smi indicates the npu-smi tool version, and the field following Version: indicates the driver version.
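
The load check in this step can also be scripted. The following sketch assumes the npu-smi info layout shown above, identifies chip rows by their Bus-Id column, and uses hypothetical function names.

```shell
# Print the driver version: the field that follows "Version:" in the header.
npu_driver_version() {
    sed -n 's/.*Version:[[:space:]]*\([^[:space:]|]*\).*/\1/p' | head -n 1
}

# Count chip rows by their PCIe Bus-Id (for example, 0000:89:00.0).
# Prints 0 and still returns 0 when no chip row is found.
npu_chip_count() {
    grep -Ec '[0-9A-Fa-f]{4}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}\.[0-7]' || true
}

# Return 0 if log text on stdin contains the copy error named in this step.
has_fd_copy_err() {
    grep -q 'davinci_mini\.fd copy err'
}

# Usage sketch:
#   out=$(npu-smi info)
#   printf '%s\n' "$out" | npu_driver_version
#   printf '%s\n' "$out" | npu_chip_count
#   dmesg | has_fd_copy_err && echo "reinstall the driver in the default path"
```

For the sample output above, npu_driver_version prints 24.1.0 and npu_chip_count prints 4.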