Atlas 800 Inference Server (Model 3000) 23.0.0 Ascend Software Installation Guide 01
Model Inference
Preparing Model Files and Datasets
Ensure that the server is connected to the network.
- Prepare the model implementation file and weight file.
- Configure the software source. For details, see Checking the Source Validity.
- Install and configure git-lfs (Ubuntu is used as an example).
apt-get install -y git
apt-get install -y git-lfs
# Configure git-lfs.
git lfs install
If "Git LFS initialized" is displayed, git-lfs is configured successfully.
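The tool installation above can also be sanity-checked with a small script; the following is a sketch only, and the `check_tool` helper is a hypothetical name, not part of this guide:

```shell
# Hypothetical pre-flight check: confirm the tools required by the
# steps above are on PATH before cloning the model repository.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: MISSING"
  fi
}
check_tool git
check_tool git-lfs
```

If either tool reports MISSING, repeat the installation step before continuing.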
- Download the model implementation and weight files and save them to any path (for example, /home).
git config --global http.sslVerify "false"
git clone https://huggingface.co/THUDM/chatglm2-6b
cd chatglm2-6b
git reset --hard 4e38bef4c028beafc8fb1837462f74c02e68fcc2
- The chatglm2-6b directory is as follows:
|-- config.json
|-- configuration_chatglm.py
|-- modeling_chatglm.py
|-- pytorch_model-00001-of-00007.bin
|-- pytorch_model-00002-of-00007.bin
|-- pytorch_model-00003-of-00007.bin
|-- pytorch_model-00004-of-00007.bin
|-- pytorch_model-00005-of-00007.bin
|-- pytorch_model-00006-of-00007.bin
|-- pytorch_model-00007-of-00007.bin
|-- pytorch_model.bin.index.json
|-- quantization.py
|-- tokenization_chatglm.py
|-- tokenizer_config.json
|-- tokenizer.model
- Add the following "world_size" setting to the config.json file:
{
    ......
    "world_size": 1,
    ......
}
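Editing config.json by hand is error-prone; the key can also be inserted programmatically. This is a sketch only — the `add_world_size` helper is a hypothetical name, and it assumes python3 is available on the server:

```shell
# Hypothetical helper: add "world_size": 1 to a config.json file.
# Uses python3's json module so the result stays valid JSON.
add_world_size() {
  python3 - "$1" <<'EOF'
import json, sys

path = sys.argv[1]
with open(path) as f:
    cfg = json.load(f)
cfg["world_size"] = 1  # value from the guide; adjust for multi-chip setups
with open(path, "w") as f:
    json.dump(cfg, f, indent=2)
EOF
}
# Example: add_world_size /home/chatglm2-6b/config.json
```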
- Obtain the quantization weight.
Contact Huawei technical support. After the file is obtained, upload it to any path (for example, /home) on the server and decompress it to obtain the quant_weight folder.
- Download the C-Eval dataset.
Obtain the C-Eval dataset, upload it to any path (for example, /home/dataset) on the server, and decompress it to obtain the C-Eval folder.
The C-Eval directory of the dataset is as follows:
|-- test
|-- val
Model Inference
Ensure that the server is connected to the network.
- Install the third-party dependency. (/home/transformer-llm is only an example. Replace it with the actual path.)
cd /home/transformer-llm/pytorch/examples/chatglm2_6b
pip3 install -r requirements.txt
- Before inference, configure the following environment variables:
export HCCL_BUFFSIZE=110
export HCCL_OP_BASE_FFTS_MODE_ENABLE=1
export TASK_QUEUE_ENABLE=1
export ATB_OPERATION_EXECUTE_ASYNC=1
export ATB_LAYER_INTERNAL_TENSOR_REUSE=1
# Performance can be improved by enabling multi-stream on Atlas 300I Pro and Atlas 300I Duo.
export ATB_USE_TILING_COPY_STREAM=1
- Perform C-Eval dataset inference. Run the following commands in the /home/transformer-llm/pytorch/examples/chatglm2_6b directory:
# Set the path of the model implementation file and weight file.
export CHECKPOINT=/home/chatglm2-6b
# Set the path of the dataset.
export DATASET=/home/dataset/CEval
# Set the path of the quantization weight file.
export QUANT_WEIGHT_PATH=/home/quant_weight
# Single-chip quantization
export ENABLE_QUANT=1
python3 generate_weights.py --model_path ${CHECKPOINT}
python3 main.py --mode precision_dataset --model_path ${CHECKPOINT} --ceval_dataset ${DATASET} --batch 8 --device 0
# Dual-chip quantization (Atlas 300I Duo)
export ENABLE_QUANT=1
python3 generate_weights.py --model_path ${CHECKPOINT} --tp_size 2
torchrun --nproc_per_node 2 --master_port 2000 main.py --mode precision_dataset --model_path ${CHECKPOINT} --ceval_dataset ${DATASET} --batch 8 --tp_size 2 --device 0
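The single- and dual-chip sequences above differ only in the tensor-parallel flags. They could be folded into one wrapper; this is a sketch only — the `run_ceval` function and its argument are assumptions, not part of this guide, and it presumes CHECKPOINT and DATASET are already exported as shown above:

```shell
# Hypothetical wrapper around the commands above: select single- or
# dual-chip quantized inference from one tensor-parallel-size argument.
run_ceval() {
  tp=${1:-1}
  export ENABLE_QUANT=1
  if [ "$tp" -eq 1 ]; then
    python3 generate_weights.py --model_path "${CHECKPOINT}"
    python3 main.py --mode precision_dataset --model_path "${CHECKPOINT}" \
      --ceval_dataset "${DATASET}" --batch 8 --device 0
  else
    python3 generate_weights.py --model_path "${CHECKPOINT}" --tp_size "$tp"
    torchrun --nproc_per_node "$tp" --master_port 2000 main.py \
      --mode precision_dataset --model_path "${CHECKPOINT}" \
      --ceval_dataset "${DATASET}" --batch 8 --tp_size "$tp" --device 0
  fi
}
# Example: run_ceval 2   # dual-chip (Atlas 300I Duo)
```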
- Check whether error information similar to the following is displayed (/usr/local/gcc7.3.0/lib64/libgomp.so.1 is only an example):
ImportError: /usr/local/gcc7.3.0/lib64/libgomp.so.1: cannot allocate memory in static TLS block
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -11) local_rank: 0 (pid: 12591) of binary: /usr/local/python3.9.2/bin/python3
If yes, run the following command to configure the environment variable (/usr/local/gcc7.3.0/lib64/libgomp.so.1 is only an example):
export LD_PRELOAD=/usr/local/gcc7.3.0/lib64/libgomp.so.1
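If the libgomp path on your system differs from the example, it can be located first. A sketch only; the `find_first` helper is a hypothetical name, not part of this guide:

```shell
# Hypothetical helper: locate the first matching library file under a
# prefix, so the result can be exported via LD_PRELOAD as shown above.
find_first() {
  find "$1" -name "$2" 2>/dev/null | head -n 1
}
# Example:
# LIBGOMP=$(find_first /usr/local libgomp.so.1)
# [ -n "$LIBGOMP" ] && export LD_PRELOAD="$LIBGOMP"
```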
- Check whether error information similar to the following is displayed:
ImportError: This modeling file requires the following packages that were not found in your environment: atb_speed. Run `pip install atb_speed`
If yes, run the following commands (replace /home/transformer-llm with the actual model package path):
cd /home/transformer-llm/pytorch/examples/atb_speed_sdk
pip install .
- Test the model performance data. Run the following commands in the /home/transformer-llm/pytorch/examples/chatglm2_6b directory:
# Set the path of the model implementation file and weight file.
export CHECKPOINT=/home/chatglm2-6b
# Set the path of the dataset.
export DATASET=/home/dataset/CEval
# Set the path of the quantization weight file.
export QUANT_WEIGHT_PATH=/home/quant_weight
# Single-chip quantization
export ENABLE_QUANT=1
python3 generate_weights.py --model_path ${CHECKPOINT}  # Skip this step if the weight has been generated.
python3 main.py --mode performance --model_path ${CHECKPOINT} --batch 8 --set_case_pair 1 --seqlen_in_pair 256,512,1024 --seqlen_out_pair 64,128,25 --device 0
# Dual-chip quantization (Atlas 300I Duo)
export ENABLE_QUANT=1
python3 generate_weights.py --model_path ${CHECKPOINT} --tp_size 2  # Skip this step if the weight has been generated.
torchrun --nproc_per_node 2 --master_port 2000 main.py --mode performance --model_path ${CHECKPOINT} --batch 16 --tp_size 2 --set_case_pair 1 --seqlen_in_pair 256,512,1024 --seqlen_out_pair 64,128,25 --device 0
- Check whether error information similar to the following is displayed (/usr/local/gcc7.3.0/lib64/libgomp.so.1 is only an example):
ImportError: /usr/local/gcc7.3.0/lib64/libgomp.so.1: cannot allocate memory in static TLS block
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -11) local_rank: 0 (pid: 12591) of binary: /usr/local/python3.9.2/bin/python3
If yes, run the following command to configure the environment variable (/usr/local/gcc7.3.0/lib64/libgomp.so.1 is only an example):
export LD_PRELOAD=/usr/local/gcc7.3.0/lib64/libgomp.so.1
- Check whether error information similar to the following is displayed:
ImportError: This modeling file requires the following packages that were not found in your environment: atb_speed. Run `pip install atb_speed`
If yes, run the following commands (replace /home/transformer-llm with the actual model package path):
cd /home/transformer-llm/pytorch/examples/atb_speed_sdk
pip install .
Document ID: EDOC1100356041