Atlas 800 Inference Server (Model 3000) 23.0.0 Ascend Software Installation Guide 01

Rapid Rollout

This section describes how to quickly install the Ascend NPU driver and firmware and the Compute Architecture for Neural Networks (CANN, the AI heterogeneous computing architecture) software on an Atlas 800 inference server (model 3000) configured with the Atlas 300I Pro inference card, and then roll out inference services.

Preparing the Installation Environment

  • Before installing dependencies, ensure that the server can connect to the external network and that available software and pip sources have been configured. For details about how to replace the software source, see Checking the Source Validity. For details about how to configure the pip source, see Configuring the PIP Source.
  • Before installing the driver, you need to create a user (HwHiAiUser) to run the driver process. HwHiAiUser is the default driver running user, so you do not need to specify a running user when installing the driver.
    groupadd HwHiAiUser
    useradd -g HwHiAiUser -d /home/HwHiAiUser -m HwHiAiUser -s /bin/bash

    If you use a container image pulled from Ascend Hub, run the following commands to create the driver running user HwHiAiUser with both UID and GID set to 1000:

    groupadd -g 1000 HwHiAiUser
    useradd -g HwHiAiUser -u 1000 -d /home/HwHiAiUser -m HwHiAiUser -s /bin/bash

    If the following information is displayed, refer to Failed to Create the Driver Running User HwHiAiUser Whose UID and GID Are 1000 to rectify the fault.

    groupadd: GID '1000' already exists
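
    You can verify that the driver running user and group were created as expected:

    # Confirm that HwHiAiUser exists and check its UID and GID.
    id HwHiAiUser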

Downloading the Software

By downloading this software, you agree to the terms and conditions of the Huawei Enterprise End User License Agreement (EULA).

To query the Ascend software version mapping, click here.

Table 1-1 Software download link

  Software Type               Package Name and Download Link
  Driver                      Ascend-hdk-xxx-npu-driver_23.0.1_linux-aarch64.run (click here to download)
  Firmware                    Ascend-hdk-xxx-npu-firmware_7.1.0.4.220.run (click here to download)
  MCU                         Ascend-hdk-xxx-mcu_23.2.3.zip (click here to download)
  Toolkit (development kit)   Ascend-cann-toolkit_7.0.1_linux-aarch64.run (click here to download)
  kernels (binary OPP)        Ascend-cann-kernels-xxx_7.0.1_linux.run (click here to download)

Uploading the Installation Packages and Granting Permissions

  1. Decompress the MCU .zip package to a local folder to obtain the Ascend-hdk-xxx-mcu_23.2.3.hpm installation package.
  2. Upload the downloaded .run package and the .hpm package obtained after decompression to any directory (for example, /home) on the server.
  3. Grant permissions on the installation packages.
    chmod +x Ascend-hdk-xxx-npu-driver_23.0.1_linux-aarch64.run
    chmod +x Ascend-hdk-xxx-npu-firmware_7.1.0.4.220.run
    chmod +x Ascend-cann-toolkit_7.0.1_linux-aarch64.run
    chmod +x Ascend-cann-kernels-xxx_7.0.1_linux.run
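
    You can optionally confirm that the execute permission was applied; the path below assumes the /home example directory from step 2.

    # List the uploaded packages and check for the x permission bit.
    ls -l /home/Ascend-*.run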

Installing the NPU Driver and Firmware

Run the uname -r command to query the kernel version of the OS.

  • If the kernel version is within the version range of the binary installation mode, you can directly install the NPU driver and firmware.
  • If the kernel version requires the source code compilation and installation mode, install the dependencies required for driver source code compilation by referring to Installing Dependencies Required for Compiling Driver Source Code, and then install the NPU driver and firmware.

Table 1-2 Kernel version requirements

  Host OS Version       Default Host OS Kernel Version        Installation Mode
  CentOS 7.6            4.14.0-115.el7a.0.1.aarch64           Binary installation
  Ubuntu 18.04.1        4.15.0-29-generic                     Binary installation
  Ubuntu 20.04          5.4.0-26-generic                      Binary installation
  Ubuntu 18.04.5        4.15.0-112-generic                    Installation from source code
  openEuler 20.03 LTS   4.19.90-2003.4.0.0036.oe1.aarch64     Installation from source code
  openEuler 22.03 LTS   5.10.0-60.18.0.50.oe2203.aarch64      Installation from source code
  Kylin V10 SP1         4.19.90-17.ky10.aarch64               Installation from source code
  Kylin V10 SP2         4.19.90-24.4.v2101.ky10.aarch64       Installation from source code
  CUlinux 3.0           5.10.0-60.67.0.104.ule3.aarch64       Installation from source code
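
For example, on Ubuntu 20.04 with its default kernel, the check described above maps to binary installation in Table 1-2:

# Query the OS kernel version and compare the output against Table 1-2.
uname -r
# Example output on a default Ubuntu 20.04 installation: 5.4.0-26-generic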

# For the Ubuntu OS, install the following dependencies before installing the driver:
apt-get install -y net-tools pciutils
# Install the NPU driver.
./Ascend-hdk-xxx-npu-driver_23.0.1_linux-aarch64.run --full --install-for-all
# Check whether the driver is successfully loaded. If the chip information is displayed, the driver is successfully loaded.
npu-smi info
# Install the NPU firmware.
./Ascend-hdk-xxx-npu-firmware_7.1.0.4.220.run --full
# Restart the OS.
reboot
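
After the restart, you can optionally confirm that the driver and firmware are loaded correctly by repeating the query used above:

# If the chip information is displayed again after the reboot, the driver and firmware are working.
npu-smi info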

Upgrading the MCU

# Query the NPU ID (device ID of the card).
npu-smi info -l
# Upgrade the MCU.
npu-smi upgrade -t mcu -i <NPU ID> -f Ascend-hdk-xxx-mcu_23.2.3.hpm
# Make the new version take effect.
npu-smi upgrade -a mcu -i <NPU ID>
# After the new version takes effect, wait for about 30 seconds before querying the MCU version to confirm that the upgrade is successful.
npu-smi upgrade -b mcu -i <NPU ID>
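
For example, if npu-smi info -l reports an NPU ID of 0 (the ID below is illustrative; use the ID reported on your server), the sequence is:

# Upgrade the MCU of NPU 0, activate the new version, wait about 30 seconds, and query the MCU version.
npu-smi upgrade -t mcu -i 0 -f Ascend-hdk-xxx-mcu_23.2.3.hpm
npu-smi upgrade -a mcu -i 0
sleep 30
npu-smi upgrade -b mcu -i 0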

Installation on a Physical Machine (PM)

  1. Install dependencies.

    Ubuntu 20.04 is used as an example to describe how to install dependencies. For details about how to install dependencies on other OSs, see Dependency Installation.
    # Install the OS dependencies.
    apt-get install -y gcc g++ make cmake zlib1g zlib1g-dev openssl libsqlite3-dev libssl-dev libffi-dev unzip pciutils net-tools libblas-dev gfortran libblas3 libopenblas-dev
    
    # Install Python3 dependencies.
    apt-get install -y python3-pip
    pip3 install --upgrade pip
    pip3 install attrs cython numpy decorator sympy cffi pyyaml pathlib2 psutil protobuf scipy requests absl-py
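
    To quickly confirm that the Python dependencies are usable, you can try importing a few of them (an optional check covering only a subset of the packages installed above):

    # Optional check: import a few of the installed Python packages.
    python3 -c "import numpy, sympy, yaml, psutil; print('Python dependencies OK')"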

  2. Install CANN.

    # Run the following commands in the /home directory to install Toolkit and kernels:
    ./Ascend-cann-toolkit_7.0.1_linux-aarch64.run --install --install-for-all --quiet
    ./Ascend-cann-kernels-xxx_7.0.1_linux.run --install --install-for-all --quiet
    
    # Run the following command to configure environment variables. To make the environment variables take effect permanently, add the following line to the end of the ~/.bashrc file and run the source ~/.bashrc command.
    source /usr/local/Ascend/ascend-toolkit/set_env.sh
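
    For example, to apply the setting permanently for the current user, the line can be appended to ~/.bashrc as follows (a minimal sketch; adjust the path if CANN was installed to a non-default location):

    # Append the CANN environment setup to ~/.bashrc so that every new shell picks it up.
    echo 'source /usr/local/Ascend/ascend-toolkit/set_env.sh' >> ~/.bashrc
    source ~/.bashrc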

  3. (Optional) Run the sample.

    After the software is installed, you can run the sample to check whether the environment is available.

    Obtain the model and weight files.

    • ResNet-50 network model file (*.prototxt): Click here to download the file.
    • ResNet-50 weight file (*.caffemodel): Click here to download the file.
    # Obtain the sample repository code from any directory on the server (the /home directory is used as an example).
    apt-get install git
    git config --global http.sslVerify "false"
    git clone https://gitee.com/ascend/samples.git
    
    # Go to the sample directory. In the following sections, the sample directory refers to the samples/cplusplus/level2_simple_inference/1_classification/resnet50_imagenet_classification directory.
    cd samples/cplusplus/level2_simple_inference/1_classification/resnet50_imagenet_classification
    
    # Obtain the original ResNet-50 model.
    pip3 install Pillow
    mkdir -p caffe_model
    
    # Upload the obtained model and weight files to the created caffe_model directory.
    
    # Run the following commands in sample directory to convert the original ResNet-50 model to an offline model (.om file) that adapts to the Ascend AI Processor.
    atc --model=caffe_model/resnet50.prototxt --weight=caffe_model/resnet50.caffemodel --framework=0 --output=model/resnet50 --soc_version=Ascendxxx --input_format=NCHW --input_fp16_nodes=data --output_type=FP32 --out_nodes=prob:0
    
    # Prepare sample images in the data directory under the sample directory. If the data directory does not exist, run the mkdir -p data command in the sample directory to create it.
    cd data
    wget https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/models/aclsample/dog1_1024_683.jpg --no-check-certificate
    wget https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/models/aclsample/dog2_1024_683.jpg --no-check-certificate
    python3 ../script/transferPic.py
    
    # Compilation and running
    # Run the following commands in the sample directory to configure environment variables.
    export DDK_PATH=/usr/local/Ascend/ascend-toolkit/latest
    export NPU_HOST_LIB=$DDK_PATH/runtime/lib64/stub
    
    # Run the following commands in the sample directory to compile the executable file.
    mkdir -p build/intermediates/host
    cd build/intermediates/host
    cmake ../../../src -DCMAKE_CXX_COMPILER=g++ -DCMAKE_SKIP_RPATH=TRUE
    make
    
    # In the out directory under the sample directory, run the following command to execute the compiled file.
    ./main

Installing the Container

  • Before pulling a container image from Ascend Hub, ensure that the installation environment can connect to the network.
  • Ensure that Docker has been installed on the host machine (you can run the docker version command to check). If it is not installed, refer to Deploying the Docker to install it.
  1. Click the inference container image link and download the container image according to Version Mapping.
  2. Obtain the images.

    1. Click the login user name and select the image download credential from the drop-down list.
    2. Click the Image Versions tab, locate the container image version corresponding to your software version, and click Download.
    3. Copy the command for obtaining the permission as prompted, run the command on the host, and enter the image download credential.
    4. Copy the image download command and run the command on the host to pull the image.
      If an error similar to the following occurs when you configure the permission to log in to Ascend Hub for image download, perform the following operations to rectify the fault:
      Error response from daemon: Get https://ascendhub.huawei.com/v2/: x509: certificate signed by unknown authority

      Run the vi /etc/docker/daemon.json command and add the Ascend Hub address (ascendhub.huawei.com) to insecure-registries in the file, as shown in the following example:

      {
              "registry-mirrors": ["http://docker.mirrors.ustc.edu.cn"],
              "insecure-registries": ["docker.mirrors.ustc.edu.cn", "ascendhub-registry.rnd.huawei.com", "registry.docker-cn.com", "ustc-edu-cn.mirror.aliyuncs.com","ascendhub.huawei.com"],
              "experimental" : true
      }
      After adding the content, run the following commands to restart Docker:
      systemctl daemon-reload 
      systemctl restart docker

  3. Click Image Overview and follow the procedure in How to Use an Image to start the container.

    You can run the ls /dev/ | grep davinci command to query the available NPU devices (for example, /dev/davinci0) on the host, and mount them to the container when starting it, as shown in the sketch below.
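
    The exact startup command depends on the image you pulled; the following is a minimal sketch that mounts NPU 0, the NPU management devices, and the driver directory into the container (the image name my-ascend-image:latest is a placeholder for the image pulled from Ascend Hub):

    docker run -it \
        --device=/dev/davinci0 \
        --device=/dev/davinci_manager \
        --device=/dev/devmm_svm \
        --device=/dev/hisi_hdc \
        -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
        -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
        my-ascend-image:latest /bin/bash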