Atlas 800 Inference Server (Model 3010) 23.0.0 Ascend Software Installation Guide 01
Quick Rollout
This section describes how to quickly install the Ascend NPU driver, firmware, and CANN (AI heterogeneous computing architecture) software on an Atlas 800 inference server (model 3010) configured with the Atlas 300I Pro inference card, and how to roll out inference services.
Preparing the Installation Environment
- Before installing dependencies, ensure that the server can connect to the external network and available software sources and pip sources have been configured. For details about how to replace the software source, see Source Validity Check. For details about how to configure the pip source, see Configuring the PIP Source.
- Before installing the driver, create a user (HwHiAiUser) to run the driver process. Because HwHiAiUser is the default running user of the driver, you do not need to specify the running user when installing the driver.
    groupadd HwHiAiUser
    useradd -g HwHiAiUser -d /home/HwHiAiUser -m HwHiAiUser -s /bin/bash
To use the container image pulled from AscendHub, run the following command to create the driver running user HwHiAiUser whose UID and GID are both 1000:
    groupadd -g 1000 HwHiAiUser
    useradd -g HwHiAiUser -u 1000 -d /home/HwHiAiUser -m HwHiAiUser -s /bin/bash
If the following information is displayed, refer to Failed to Create the Driver Running User HwHiAiUser Whose UID and GID Are 1000 to rectify the fault.
groupadd: GID '1000' already exists
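Before creating the user with UID and GID 1000, it can help to see whether those IDs are already taken. The following is a minimal sketch, not part of the official procedure; the helper name `owner_of_id` is hypothetical, and it assumes local accounts stored in /etc/passwd and /etc/group (accounts from a directory service would not show up).

```shell
#!/bin/sh
# Print the name that owns numeric ID $2 in a passwd- or group-format file $1.
# In both /etc/passwd and /etc/group the numeric ID is the third ':'-separated field.
owner_of_id() {
    awk -F: -v id="$2" '$3 == id { print $1; exit }' "$1"
}

# Report who, if anyone other than HwHiAiUser, already holds UID/GID 1000 on this host.
for db in /etc/passwd /etc/group; do
    owner=$(owner_of_id "$db" 1000)
    if [ -n "$owner" ] && [ "$owner" != "HwHiAiUser" ]; then
        echo "ID 1000 in $db is already used by: $owner"
    fi
done
```

If the script prints a name, the groupadd or useradd command above is likely to fail with the "already exists" message, and the referenced troubleshooting topic applies.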
Downloading the Software
Once you download this software, you agree to the terms and conditions of Huawei Enterprise End User License Agreement (EULA).
To query the Ascend software version mapping, click here.
Software Type | Package Name and Download Link |
---|---|
Driver | Click here to download Ascend-hdk-xxx-npu-driver_23.0.1_linux-x86-64.run. |
Firmware | Click here to download Ascend-hdk-xxx-npu-firmware_7.1.0.4.220.run. |
MCU | Click here to download Ascend-hdk-xxx-mcu_23.2.3.zip. |
Toolkit (development kit) | Click here to download Ascend-cann-toolkit_7.0.1_linux-x86_64.run. |
kernels (binary OPP) | Click here to download Ascend-cann-kernels-xxx_7.0.1_linux.run. |
Uploading the Installation Packages and Granting Permissions
- Decompress the MCU .zip package to a local folder to obtain the Ascend-hdk-xxx-mcu_23.2.3.hpm installation package.
- Upload the downloaded .run package and the .hpm package obtained after decompression to any directory (for example, /home) on the server.
- Grant permissions on the installation packages.
    chmod +x Ascend-hdk-xxx-npu-driver_23.0.1_linux-x86-64.run
    chmod +x Ascend-hdk-xxx-npu-firmware_7.1.0.4.220.run
    chmod +x Ascend-cann-toolkit_7.0.1_linux-x86_64.run
    chmod +x Ascend-cann-kernels-xxx_7.0.1_linux.run
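When several packages sit in one upload directory, the permission step can also be done in a loop. This is a small sketch under the assumption that every file ending in .run in that directory is an installer you intend to execute; the function name `make_installers_executable` is hypothetical.

```shell
#!/bin/sh
# Make every .run installer in directory $1 executable and list the ones changed.
make_installers_executable() {
    for pkg in "$1"/*.run; do
        [ -e "$pkg" ] || continue   # the glob matched nothing
        chmod +x "$pkg" && echo "executable: $pkg"
    done
}

# Usage (the guide uses /home as the example upload directory):
# make_installers_executable /home
```

The .hpm package is not touched, because it is passed to npu-smi as a file rather than executed.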
Installing the NPU Driver and Firmware
For the SLES 12.5 OS, set allow_unsupported_modules to 1 in the /etc/modprobe.d/10-unsupported-modules.conf configuration file before installing the driver and firmware.
Run the uname -r command to query the kernel version of the OS.
- If the kernel version is within the version range of the binary installation mode, you can directly install the NPU driver and firmware.
- If the kernel version requires the source code compilation and installation mode, install the dependencies required for driver source code compilation by referring to Installing Dependencies Required for Compiling Driver Source Code, and then install the NPU driver and firmware.
Host OS Version | Default Host OS Kernel Version | Installation Mode |
---|---|---|
CentOS 7.6 | 3.10.0-957.el7.x86_64 | Binary installation |
Ubuntu 20.04 | 5.4.0-26-generic | Binary installation |
CentOS 8.0 | 4.18.0-80.el8.x86_64 | Installation from source code |
SLES 12.5 | 4.12.14-120-default | Installation from source code |
openEuler 20.03 LTS | 4.19.90-2003.4.0.0036.oe1.x86_64 | Installation from source code |
openEuler 22.03 LTS | 5.10.0-60.18.0.50.oe2203.x86_64 | Installation from source code |
Kylin V10 SP1 | 4.19.90-17.ky10.x86_64 | Installation from source code |
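The kernel check can be scripted. The sketch below is illustrative only: it assumes the table's merged cells mean that the CentOS 7.6 and Ubuntu 20.04 default kernels use binary installation and the remaining listed kernels use source installation. It recognizes only those exact default kernel strings, and the function name `install_mode` is hypothetical.

```shell
#!/bin/sh
# Map a kernel release string to the installation mode listed in the table.
# Only the default kernels from the table are recognized; anything else is "unknown".
install_mode() {
    case "$1" in
        3.10.0-957.el7.x86_64|5.4.0-26-generic)
            echo "binary" ;;
        4.18.0-80.el8.x86_64|4.12.14-120-default|4.19.90-2003.4.0.0036.oe1.x86_64|5.10.0-60.18.0.50.oe2203.x86_64|4.19.90-17.ky10.x86_64)
            echo "source" ;;
        *)
            echo "unknown" ;;
    esac
}

# Classify the kernel of the current host.
echo "Kernel $(uname -r): $(install_mode "$(uname -r)")"
```

A result of "unknown" means the kernel differs from the listed defaults, so the supported version range in the full guide should be consulted before choosing a mode.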
    # For the Ubuntu OS, install the following dependencies before installing the driver:
    apt-get install -y net-tools pciutils
    # Install the NPU driver.
    ./Ascend-hdk-xxx-npu-driver_23.0.1_linux-x86-64.run --full --install-for-all
    # Check whether the driver is successfully loaded. If the chip information is displayed, the driver is successfully loaded.
    npu-smi info
    # Install the NPU firmware.
    ./Ascend-hdk-xxx-npu-firmware_7.1.0.4.220.run --full
    # Restart the OS.
    reboot
Upgrading the MCU
    # Query the NPU ID (device ID of the card).
    npu-smi info -l
    # Upgrade the MCU.
    npu-smi upgrade -t mcu -i NPU ID -f Ascend-hdk-xxx-mcu_23.2.3.hpm
    # Make the new version take effect.
    npu-smi upgrade -a mcu -i NPU ID
    # After the new version takes effect, wait for 30 seconds before checking the MCU version to ensure that the upgrade is successful.
    npu-smi upgrade -b mcu -i NPU ID
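To make the order of the upgrade steps and the 30-second wait explicit, the sequence can be generated for a concrete NPU ID. This is a dry-run sketch: it only prints the commands shown above with the placeholder filled in and executes nothing. The function name `mcu_upgrade_plan` and the NPU ID 0 are assumptions for illustration; use the ID reported by npu-smi info -l.

```shell
#!/bin/sh
# Print the MCU upgrade command sequence for NPU ID $1 and package file $2.
# Nothing is executed; review the output before running the commands yourself.
mcu_upgrade_plan() {
    echo "npu-smi upgrade -t mcu -i $1 -f $2"   # upgrade the MCU
    echo "npu-smi upgrade -a mcu -i $1"         # make the new version take effect
    echo "sleep 30"                             # wait before checking the version
    echo "npu-smi upgrade -b mcu -i $1"         # check the MCU version
}

# Example with a hypothetical NPU ID of 0.
mcu_upgrade_plan 0 Ascend-hdk-xxx-mcu_23.2.3.hpm
```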
Installation on a Physical Machine (PM)
- Install dependencies. Ubuntu 20.04 is used as an example to describe how to install dependencies. For details about how to install dependencies on other OSs, see Dependency Installation.
    # Install the OS dependencies.
    apt-get install -y gcc g++ make cmake zlib1g zlib1g-dev openssl libsqlite3-dev libssl-dev libffi-dev unzip pciutils net-tools libblas-dev gfortran libblas3 libopenblas-dev
    # Install Python3 dependencies.
    apt-get install -y python3-pip
    pip3 install --upgrade pip
    pip3 install attrs cython numpy decorator sympy cffi pyyaml pathlib2 psutil protobuf scipy requests absl-py
- Install CANN.
    # Run the following commands in the /home directory to install Toolkit and kernels:
    ./Ascend-cann-toolkit_7.0.1_linux-x86_64.run --install --install-for-all --quiet
    ./Ascend-cann-kernels-xxx_7.0.1_linux.run --install --install-for-all --quiet
    # Run the following command to configure environment variables. To make the environment variables take effect permanently, add the following line to the end of the ~/.bashrc file and run the source ~/.bashrc command.
    source /usr/local/Ascend/ascend-toolkit/set_env.sh
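If the installation is repeated, appending the source line to ~/.bashrc each time leaves duplicate entries. One way to avoid that is sketched below; the helper name `append_once` is hypothetical, and the sketch assumes the target file, if it exists, ends with a newline.

```shell
#!/bin/sh
# Append line $2 to file $1 unless an identical line is already present.
append_once() {
    touch "$1"
    # -x matches whole lines only, -F treats the text as a fixed string.
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Usage:
# append_once "$HOME/.bashrc" "source /usr/local/Ascend/ascend-toolkit/set_env.sh"
```

Running source ~/.bashrc afterwards, as described above, applies the variables to the current shell.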
- (Optional) Run the sample.
After the software is installed, you can run the sample to check whether the environment is available.
Obtain the model and weight files.
- ResNet-50 network model file (*.prototxt): Click here to download the file.
- ResNet-50 weight file (*.caffemodel): Click here to download the file.
    # Obtain the sample repository code from any directory on the server (the /home directory is used as an example).
    apt-get install git
    git config --global http.sslVerify "false"
    git clone https://gitee.com/ascend/samples.git
    # Go to the sample directory. In the following sections, the sample directory refers to the samples/cplusplus/level2_simple_inference/1_classification/resnet50_imagenet_classification directory.
    cd samples/cplusplus/level2_simple_inference/1_classification/resnet50_imagenet_classification
    # Obtain the original ResNet-50 model.
    pip3 install Pillow
    mkdir -p caffe_model
    # Upload the obtained model and weight files to the created caffe_model directory.
    # Run the following command in the sample directory to convert the original ResNet-50 model to an offline model (.om file) adapted to the Ascend AI Processor.
    atc --model=caffe_model/resnet50.prototxt --weight=caffe_model/resnet50.caffemodel --framework=0 --output=model/resnet50 --soc_version=Ascendxxx --input_format=NCHW --input_fp16_nodes=data --output_type=FP32 --out_nodes=prob:0
    # Prepare sample images in sample directory/data. If the data directory does not exist, run the mkdir -p data command in the sample directory to create it.
    cd data
    wget https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/models/aclsample/dog1_1024_683.jpg --no-check-certificate
    wget https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/models/aclsample/dog2_1024_683.jpg --no-check-certificate
    python3 ../script/transferPic.py
    # Compilation and running
    # Run the following commands in the sample directory to configure environment variables.
    export DDK_PATH=/usr/local/Ascend/ascend-toolkit/latest
    export NPU_HOST_LIB=$DDK_PATH/runtime/lib64/stub
    # Run the following commands in the sample directory to compile the executable file.
    mkdir -p build/intermediates/host
    cd build/intermediates/host
    cmake ../../../src -DCMAKE_CXX_COMPILER=g++ -DCMAKE_SKIP_RPATH=TRUE
    make
    # In sample directory/out, run the following command to run the compiled file.
    ./main
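The model conversion step depends on files that are uploaded by hand, so a quick pre-flight check can save a failed atc run. The following is a minimal sketch, assuming it is run from the sample directory and that the files use the names given in the atc command above; the helper name `missing_files` is hypothetical.

```shell
#!/bin/sh
# Print each path from the argument list that does not exist or is empty.
missing_files() {
    for f in "$@"; do
        [ -s "$f" ] || echo "$f"
    done
}

# Check the two inputs that the atc command expects in caffe_model.
missing=$(missing_files caffe_model/resnet50.prototxt caffe_model/resnet50.caffemodel)
if [ -n "$missing" ]; then
    echo "Upload these files before running atc:"
    echo "$missing"
fi
```

The same helper can be pointed at model/resnet50.om after conversion to confirm that the offline model was written.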
Installing the Container
- Before pulling a container image from Ascend Hub, ensure that the installation environment can be connected to the network.
- Ensure that Docker is installed on the host machine. (You can run the docker version command to check.) If it is not installed, refer to Deploying the Docker to install it.
- Click the inference container image link and download the container image according to Version Mapping.
- Obtain the images.
- Click the login user name and select the image download credential from the drop-down list.
- Click the Image Versions tab, download the container image corresponding to your software version, and click Download.
- Copy the command for obtaining the permission as prompted, run the command on the host, and enter the image download credential.
- Copy the image download command and run the command on the host to pull the image.
If an error similar to the following occurs when you configure the permission to log in to Ascend Hub for image download, perform the following operations to rectify the fault:
Error response from daemon: Get https://ascendhub.huawei.com/v2/: x509: certificate signed by unknown authority
Run the vi /etc/docker/daemon.json command and add the Ascend Hub address to insecure-registries in the file. In the following example, the added address is the last entry, "ascendhub.huawei.com":
    {
      "registry-mirrors": ["http://docker.mirrors.ustc.edu.cn"],
      "insecure-registries": ["docker.mirrors.ustc.edu.cn", "ascendhub-registry.rnd.huawei.com", "registry.docker-cn.com", "ustc-edu-cn.mirror.aliyuncs.com", "ascendhub.huawei.com"],
      "experimental": true
    }
After adding the content, run the following commands to restart Docker:
    systemctl daemon-reload
    systemctl restart docker
- Click Image Overview and follow the procedure in How to Use an Image to start the container.
You can run the ls /dev/ | grep davinci* command to query the available NPUs (for example, /dev/davinci0) on the host and mount the available NPUs to the container when starting the container.
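The device nodes found on the host can be turned into docker run options. The sketch below is illustrative and assumes that each numbered /dev/davinciN node is passed to the container with Docker's --device option. The function name `npu_device_flags` is hypothetical, and any other devices, mounts, or options required by the image's How to Use an Image instructions are not covered.

```shell
#!/bin/sh
# Print one --device flag per numbered davinci node found in directory $1 (default /dev).
npu_device_flags() {
    dir=${1:-/dev}
    for node in "$dir"/davinci[0-9]*; do
        [ -e "$node" ] || continue   # no NPU device nodes found
        printf '%s ' "--device=/dev/${node##*/}"
    done
    echo
}

# Show the flags for this host; the line is empty if no NPU nodes exist.
npu_device_flags
```

The printed flags can then be pasted into the docker run command taken from the image's usage instructions.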