Quantizing an Object Detection Network
Prerequisites
- Model
Upload the TensorFlow model to be quantized to any directory on the Linux server as the AMCT installation user. This section uses the yolov3/pre_model/yolov3_coco.pb model available in the sample package as an example.
If you choose to use your own model, you are advised to run inference in the TensorFlow environment in advance to verify that the model runs properly and delivers the expected accuracy (a sketch of such a check follows this list).
- Image dataset
After the model is quantized using the AMCT, perform inference with the quantized model to test its accuracy. Use a dataset that matches the model.
Upload the dataset matching the model to any directory on the Linux server as the AMCT installation user. This section uses the yolov3/detection.jpg dataset available in the sample package as an example.
- Calibration dataset
The calibration dataset is used to generate the quantization factors that preserve model accuracy.
The process of computing the quantization factors is referred to as calibration. To complete calibration, perform inference with the quantized network on one or more batches of images drawn from the validation dataset.
Upload the calibration dataset file to any directory on the Linux server as the AMCT installation user. This section uses the yolov3/calibration.jpg calibration dataset available in the sample package as an example.
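The pre-check advised in the Model prerequisite can be scripted as follows. This is a minimal sketch, assuming a TensorFlow 1.x environment and placeholder tensor names ('input/input_data:0' and 'pred_sbbox/concat_2:0'); replace them with the actual input and output node names of your model.
# Pre-quantization check: load the original frozen graph and run one
# inference on the sample image. Tensor names below are assumptions.
import numpy as np
import tensorflow as tf
from PIL import Image

graph_def = tf.GraphDef()
with tf.gfile.GFile('pre_model/yolov3_coco.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())
graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name='')

# Preprocess the sample image (416x416 RGB, scaled to [0, 1] is assumed).
img = Image.open('detection.jpg').convert('RGB').resize((416, 416))
batch = np.expand_dims(np.asarray(img, np.float32) / 255.0, axis=0)

with tf.Session(graph=graph) as sess:
    out = sess.run('pred_sbbox/concat_2:0',
                   feed_dict={'input/input_data:0': batch})
    print('output shape:', out.shape)  # sanity check before quantizing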
Quantization Example
- Go to the sample/yolov3 directory and run the following command to quantize the YOLOv3 network:
python3.7.5 YOLOV3_sample.py
If messages similar to the following are displayed, the quantization is successful:
INFO - [AMCT]:[save_model]: The model is saved in $HOME/amct/amct_tf/sample/yolov3/result/YOLOv3_quantized.pb //Directory of the quantized model
origin.png save successfully! //Pre-check result
quantize.png save successfully! //Post-check result
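For reference, a calibration flow like the one in YOLOV3_sample.py can be sketched as follows. The create_quant_config and save_model names appear in this guide; quantize_model and all parameter names below are assumptions, so check them against the amct_tensorflow API reference before use.
# Sketch of the AMCT calibration flow. Signatures are assumptions.
import numpy as np
import tensorflow as tf
import amct_tensorflow as amct
from PIL import Image

# Load the original frozen graph into the default graph.
graph_def = tf.GraphDef()
with tf.gfile.GFile('pre_model/yolov3_coco.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())
tf.import_graph_def(graph_def, name='')
graph = tf.get_default_graph()

# 1. Generate the quantization configuration file (config.json).
amct.create_quant_config(config_file='./config.json', graph=graph)

# 2. Insert calibration ops; quantization factors go to record.txt.
amct.quantize_model(graph=graph, config_file='./config.json',
                    record_file='./record.txt')

# 3. Calibrate by running inference on the calibration image(s).
#    Tensor names and preprocessing are assumptions, as in the earlier sketch.
img = Image.open('calibration.jpg').convert('RGB').resize((416, 416))
batch = np.expand_dims(np.asarray(img, np.float32) / 255.0, axis=0)
with tf.Session(graph=graph) as sess:
    sess.run('pred_sbbox/concat_2:0', feed_dict={'input/input_data:0': batch})

# 4. Save the quantized model; this step produces the log message above.
amct.save_model(pb_model='pre_model/yolov3_coco.pb',
                outputs=['pred_sbbox/concat_2'],
                record_file='./record.txt',
                save_path='./result/YOLOv3')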
- After the quantization is complete, the quantization configuration file config.json, quantization log folder amct_log, quantization result folder result, and pre-check result origin.png and post-check result quantize.png are generated in the same directory as the quantization script.
- config.json: configuration file that describes how to quantize each layer in the model. If a quantization configuration file with the same name already exists in the directory of the quantization script when the create_quant_config API is called again, the existing file is overwritten; otherwise, a new quantization configuration file is created.
If the accuracy of model inference drops significantly after quantization, you can modify the config.json file by referring to Quantization Configuration.
- amct_log: AMCT log folder, including the quantization log amct_tensorflow.log.
- record.txt: file that records the quantization factors. For details about the format of this file, see Quantization Factor Record File.
- result: quantization result folder, containing the quantized model file and the quantization information file.
- YOLOv3_quant.json: quantization information file (named after the quantized model). This file records the node mapping between the quantized model and the original model and is used for accuracy comparison between the two.
- YOLOv3_quantized.pb: quantized model, which can be used for accuracy simulation in the TensorFlow environment (see the sketch after this list) and deployed on the Ascend AI Processor.
When a model is re-quantized, the existing result files will be overwritten.
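Because the quantized .pb runs directly in TensorFlow, a quick accuracy-simulation check is to feed the same image to both models and compare outputs. A minimal sketch, reusing the placeholder tensor names assumed above:
# Compare original vs. quantized outputs on one image.
import numpy as np
import tensorflow as tf
from PIL import Image

def run_pb(pb_path, in_name, out_name, batch):
    # Load a frozen graph and run a single inference.
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(pb_path, 'rb') as f:
        graph_def.ParseFromString(f.read())
    graph = tf.Graph()
    with graph.as_default():
        tf.import_graph_def(graph_def, name='')
    with tf.Session(graph=graph) as sess:
        return sess.run(out_name, feed_dict={in_name: batch})

img = Image.open('detection.jpg').convert('RGB').resize((416, 416))
batch = np.expand_dims(np.asarray(img, np.float32) / 255.0, axis=0)

orig = run_pb('pre_model/yolov3_coco.pb',
              'input/input_data:0', 'pred_sbbox/concat_2:0', batch)
quant = run_pb('result/YOLOv3_quantized.pb',
               'input/input_data:0', 'pred_sbbox/concat_2:0', batch)

# Cosine similarity close to 1.0 indicates little quantization error.
cos = np.dot(orig.ravel(), quant.ravel()) / (
    np.linalg.norm(orig.ravel()) * np.linalg.norm(quant.ravel()))
print('cosine similarity:', cos)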
- (Optional) Convert the quantized deployable model into an offline model adapted to the Ascend AI Processor by referring to ATC Tool Instructions.
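A typical ATC invocation for this model might look as follows. Here --framework=3 selects TensorFlow; the input node name, input shape, and --soc_version value are assumptions that must be adapted from the ATC Tool Instructions for your hardware:
atc --model=result/YOLOv3_quantized.pb --framework=3 --output=yolov3_quantized --input_shape="input/input_data:1,416,416,3" --soc_version=Ascend310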