Uniform Quantization
This topic describes how to perform uniform quantization on a Caffe network by using the quantization script.
Quantizing an Image Classification Network
Prerequisites
Model
Upload the Caffe model to quantize and its weight file to any directory on the Linux server as the AMCT installation user. This section uses the image classification network ResNet-50 available in the sample package as an example. Download the model file in advance.
python3.7.5 download_prototxt.py --caffe_dir CAFFE_DIR --close_certificate_verify
For details about the available command-line options, see Table 3-5.
| Option | Description |
|---|---|
| --h | (Optional) Displays help information. |
| --caffe_dir CAFFE_DIR | (Required) Sets the directory of the Caffe source code, which can be relative or absolute. |
| --close_certificate_verify | (Optional) Disables certificate validation to ensure successful download. If a message indicating an authentication failure is displayed during model download, add this option to your download command and try again. |
An example is as follows:
python3.7.5 download_prototxt.py --caffe_dir caffe-master --close_certificate_verify
If messages similar to the following are displayed, the model file is successfully downloaded:
[INFO]Download 'ResNet-50-deploy.prototxt' to '$HOME/amct/amct_caffe/sample/resnet50/pre_model/ResNet-50-deploy.prototxt' success.
[INFO]Download 'ResNet-50_retrain.prototxt' to '$HOME/amct/amct_caffe/sample/resnet50/pre_model/ResNet-50_retrain.prototxt' success.
You can view the downloaded model in the sample/resnet50/pre_model directory as prompted. ResNet-50_retrain.prototxt is the model file used in the retrain scenario. For details, see Retrain-based Quantization.
Image Dataset
After the model is quantized using the AMCT, perform inference with the model to test the model accuracy. Use the dataset that matches the model.
Upload the dataset matching the model to any directory on the Linux server as the AMCT installation user. This section uses the images dataset corresponding to the ResNet-50 network in the sample package as an example.
Calibration Dataset
The calibration dataset is used to compute the quantization factors, which determine how well the quantized model preserves accuracy.
The process of calculating the quantization factors is referred to as calibration. Perform inference using the quantized network with one or more batches of a subset of images from the validation dataset to complete calibration. To ensure the quantization accuracy, the source of the calibration dataset must be the same as that of the validation dataset.
Upload the calibration dataset file to any directory on the Linux server as the AMCT installation user.
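To make the role of the quantization factors concrete, the following is a minimal Python sketch of uniform 8-bit quantization in which a scale and an offset are derived from the minimum and maximum of the calibration samples. It illustrates the general technique only and is not the algorithm that the AMCT uses internally. The function names calibrate, quantize, and dequantize are hypothetical.

```python
def calibrate(samples, num_bits=8):
    """Derive (scale, offset) from calibration samples using their min/max range."""
    lo = min(min(samples), 0.0)    # extend the range so that 0.0 is representable
    hi = max(max(samples), 0.0)
    levels = 2 ** num_bits - 1     # 255 steps for 8 bits
    scale = (hi - lo) / levels if hi > lo else 1.0
    qmin = -(2 ** (num_bits - 1))  # -128 for signed 8-bit integers
    offset = qmin - round(lo / scale)
    return scale, offset


def quantize(x, scale, offset, num_bits=8):
    """Map a float to a signed integer on the uniform grid, clamping to the range."""
    qmin = -(2 ** (num_bits - 1))
    qmax = 2 ** (num_bits - 1) - 1
    q = round(x / scale) + offset
    return max(qmin, min(qmax, q))


def dequantize(q, scale, offset):
    """Map a quantized integer back to an approximate float value."""
    return (q - offset) * scale


if __name__ == "__main__":
    # Hypothetical calibration samples; real calibration uses activations from inference.
    scale, offset = calibrate([-1.0, 0.0, 0.5, 3.0])
    for value in (-1.0, 0.5, 3.0):
        q = quantize(value, scale, offset)
        print(value, q, dequantize(q, scale, offset))
```

Because the scale and offset depend entirely on the value range seen during calibration, samples that do not resemble the validation data produce a poorly fitted grid, which is why the two datasets should come from the same source.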
Quantization Example
Two quantization scripts are available: the ResNet50_sample.py script, which offers the most flexibility, and the encapsulated script run_resnet50_with_arq.sh, which requires only a few parameters to be configured. Select the script that best suits your requirements.
- Run the quantization script.
- ResNet50_sample.py quantization script
- Precheck the original network to test if it can run properly in the Caffe environment.
This step is added to identify risks in advance such as dataset and model mismatch and model execution failures in the Caffe environment.
Run the following command in the directory of the quantization script to precheck the ResNet-50 network:
python3.7.5 ResNet50_sample.py --model_file MODEL_FILE --weights_file WEIGHTS_FILE [--gpu GPU_ID] [--cpu] [--iterations ITERATIONS] --caffe_dir CAFFE_DIR [--pre_test]
Table 3-6 describes the command-line options.
Table 3-6 Command-line options
| Option | Description |
|---|---|
| --h | (Optional) Displays help information. |
| --model_file MODEL_FILE | (Required) Sets the path of the Caffe model file (.prototxt). |
| --weights_file WEIGHTS_FILE | (Required) Sets the path of the Caffe weight file (.caffemodel). |
| --gpu GPU_ID | (Optional) Sets the ID of the compute GPU device. NOTE: In GPU inference mode, compile the Caffe environment of the GPU version before running the quantization script. |
| --cpu | (Optional) Enables the CPU inference mode. [--gpu GPU_ID] and [--cpu] are mutually exclusive. The default value is [--cpu]. |
| --iterations ITERATIONS | (Optional) Sets the batch count for inference. |
| --caffe_dir CAFFE_DIR | (Required) Sets the directory of the Caffe source code, which can be relative or absolute. |
| --pre_test | (Optional) Pre-checks the original model and provides the inference result if it can run properly in the Caffe environment. |
| --cfg_define CFG_DEFINE | (Optional) Sets the configuration file path. Required for non-uniform quantization. |
| --benchmark | (Optional) Uses the benchmark ImageNet dataset for quantization. Required for model accuracy analysis. |
| --dataset DATASET | (Optional) Sets the directory of the ImageNet dataset in LMDB format. Required for model accuracy analysis. |
An example is as follows.
python3.7.5 ResNet50_sample.py --model_file pre_model/ResNet-50-deploy.prototxt --weights_file pre_model/ResNet-50-model.caffemodel --gpu 0 --caffe_dir caffe-master --pre_test
If a message similar to the following is displayed, the original model runs properly in the Caffe environment:
[AMCT][INFO]Run ResNet-50 without quantize success!
- Run the quantization script to quantize the original network.
python3.7.5 ResNet50_sample.py --model_file pre_model/ResNet-50-deploy.prototxt --weights_file pre_model/ResNet-50-model.caffemodel --gpu 0 --caffe_dir caffe-master
If messages similar to the following are displayed, the model is successfully quantized. (The top 1 and top 5 inference accuracy results are examples only.)
******final top1:0.86875
******final top5:0.95 //Top 1 and top 5 inference accuracy results of the quantized fake_quant model in the Caffe environment.
[AMCT][INFO]Run ResNet-50 with quantize success!
When quantizing an original third-party network on GPU, if a message is displayed indicating that the GPU resources are insufficient, as shown in the following figure, take the following steps to fix the problem:
- Use a GPU with more video RAM.
- Check if there are other processes sharing the GPU resources. If yes, wait until the GPU resources are idle.
- If the memory is sufficient, switch to the CPU mode.
- run_resnet50_with_arq.sh quantization encapsulation script
The run_resnet50_with_arq.sh script in the sample/resnet50/scripts directory has encapsulated the ResNet50_sample.py quantization script and minimized the parameters to be configured.
Run the following command in the sample/resnet50 directory:
bash scripts/run_resnet50_with_arq.sh -c your_caffe_dir -g gpu_id
The command-line options are described as follows.
Table 3-7 Command-line options
| Option | Description |
|---|---|
| -c | (Required) Caffe-master directory. |
| -g | (Optional) GPU device ID. If not specified, quantization runs on the CPU by default. |
An example is as follows.
bash scripts/run_resnet50_with_arq.sh -c caffe-master -g 0
If messages similar to the following are displayed, the model is successfully quantized. (The top 1 and top 5 inference accuracy results are examples only.)
******final top1:0.86875
******final top5:0.95 //Top 1 and top 5 inference accuracy results of the quantized fake_quant model in the Caffe environment.
[AMCT][INFO]Run ResNet-50 with quantize success!
- View the quantization result.
After the quantization is complete, the inference result using the accuracy simulation model obtained after quantization is displayed. The quantization log folder amct_log, quantization result folder results, and quantization temporary folder tmp are generated in the same directory as the quantized model.
- amct_log: AMCT log folder, including the quantization log amct_caffe.log.
- tmp: quantization temporary folder, containing:
- config.json: configuration file that describes how to quantize each layer in the model. If a quantization configuration file already exists in the directory of the quantization script, when the create_quant_config API is called again, the existing quantization configuration file is overwritten if the new quantization configuration file has the same name as that of the existing one. Otherwise, a new quantization configuration file is created. If the accuracy of model inference drops significantly after quantization, you can modify the config.json file by referring to Quantization Configuration.
- modified_model.prototxt and modified_model.caffemodel: intermediate model files
- scale_offset_record.txt: file that records quantization factors. For details about the prototype definition of the file, see Quantization Factor Record File.
- results/calibration_results: quantization result folder, containing the quantized model file as well as its weight file and quantization information file ResNet50_quant.json (named after the quantized model).
- ResNet50_deploy_model.prototxt: quantized model file to be deployed on the Ascend AI Processor.
- ResNet50_deploy_weights.caffemodel: weight file of the quantized model to be deployed on the Ascend AI Processor.
- ResNet50_fake_quant_model.prototxt: quantized model file for accuracy simulation in the Caffe environment.
- ResNet50_fake_quant_weights.caffemodel: weight file of the quantized model file for accuracy simulation in the Caffe environment.
- ResNet50_quant.json: quantization information file (named after the quantized model). This file gives the node mapping between the quantized model and the original model and is used for accuracy comparison between the two models.
When a model is re-quantized, the existing result files will be overwritten.
- (Optional) Convert the quantized deployable model into an offline model adapted to the Ascend AI Processor by referring to ATC Tool Instructions.
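As a quick sanity check after a quantization run, the result files listed above can be verified programmatically. The following Python sketch assumes the file names given in this section and the results/calibration_results path. The helper name missing_result_files is hypothetical and is not part of the AMCT.

```python
from pathlib import Path

# Suffixes of the result files listed in this section; the prefix is the model name.
EXPECTED_SUFFIXES = (
    "_deploy_model.prototxt",
    "_deploy_weights.caffemodel",
    "_fake_quant_model.prototxt",
    "_fake_quant_weights.caffemodel",
    "_quant.json",
)


def missing_result_files(result_dir, prefix="ResNet50"):
    """Return the names of expected result files that are absent from result_dir."""
    root = Path(result_dir)
    names = [prefix + suffix for suffix in EXPECTED_SUFFIXES]
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    # Path assumed from the folder layout described above.
    missing = missing_result_files("results/calibration_results")
    print("all result files present" if not missing else f"missing: {missing}")
```

An empty list indicates that all five files exist. It does not validate their contents.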
Quantization Example Using the convert_model API
Prerequisites
- For details about how to prepare the model, dataset, and calibration dataset, see Prerequisites.
- Quantization factors:
Upload the quantization factor record file to any directory on the Linux server as the AMCT installation user. The following uses the quantization factors of the ResNet-50 network available in the sample package for illustration convenience. For details about quantization factors, see Quantization Factor Record File.
Quantization Example
- Precheck the original network to test if it can run properly in the Caffe environment.
This step is added to identify risks in advance such as dataset and model mismatch and model execution failures in the Caffe environment.
Run the following command in the sample/resnet50 directory to precheck the ResNet-50 network:
python3.7.5 convert_model.py --model_file MODEL_FILE --weights_file WEIGHTS_FILE --record_file RECORD_FILE [--gpu GPU_ID] [--cpu] [--iterations ITERATIONS] --caffe_dir CAFFE_DIR [--pre_test]
Here, --record_file RECORD_FILE is a required option that specifies the path of the quantization factor record file (.txt). For details about the remaining options, see Table 3-6.
An example is as follows:
python3.7.5 convert_model.py --model_file pre_model/ResNet-50-deploy.prototxt --weights_file pre_model/ResNet-50-model.caffemodel --record_file pre_model/record.txt --gpu 0 --caffe_dir caffe-master --pre_test
If a message similar to the following is displayed, the original model runs properly in the Caffe environment:
[AMCT][INFO]Run ResNet-50 without quantize success!
- Run the quantization script.
python3.7.5 convert_model.py --model_file pre_model/ResNet-50-deploy.prototxt --weights_file pre_model/ResNet-50-model.caffemodel --record_file pre_model/record.txt --gpu 0 --caffe_dir caffe-master
If messages similar to the following are displayed, the model is successfully quantized. (The top 1 and top 5 inference accuracy results are examples only.)
******final top1:0.86875
******final top5:0.95625 //Top 1 and top 5 inference accuracy results of the quantized fake_quant models in the Caffe environment.
[AMCT][INFO]Run ResNet-50 with quantize success!
- View the quantization result.
After the quantization is complete, the inference result using the accuracy simulation model obtained after quantization is displayed. The quantization log folder amct_log and the quantization result folder convert_results are generated in the same directory as the quantized model.
- amct_log: AMCT log folder, including the quantization log amct_caffe.log.
- results/convert_results: quantization result folder, including the quantized model file as well as its weight file and quantization information file.
- ResNet50_deploy_model.prototxt: quantized model file to be deployed on the Ascend AI Processor.
- ResNet50_deploy_weights.caffemodel: weight file of the quantized model to be deployed on the Ascend AI Processor.
- ResNet50_fake_quant_model.prototxt: quantized model file for accuracy simulation in the Caffe environment.
- ResNet50_fake_quant_weights.caffemodel: weight file of the quantized model file for accuracy simulation in the Caffe environment.
- ResNet50_quant.json: quantization information file (named after the quantized model). This file gives the node mapping between the quantized model and the original model and is used for accuracy comparison between the two models.
When a model is re-quantized, the existing result files will be overwritten.
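When the precheck and the quantization run are scripted, assembling the command line in code helps keep the two invocations consistent. The following Python sketch builds the argument list for convert_model.py from the options documented in this section. The helper name build_convert_command is hypothetical, and the sketch only constructs the command without executing it.

```python
def build_convert_command(model_file, weights_file, record_file, caffe_dir,
                          gpu_id=None, iterations=None, pre_test=False,
                          python="python3.7.5"):
    """Assemble the argument list for convert_model.py from the documented options."""
    cmd = [python, "convert_model.py",
           "--model_file", model_file,
           "--weights_file", weights_file,
           "--record_file", record_file,
           "--caffe_dir", caffe_dir]
    # --gpu and --cpu are mutually exclusive; CPU is the documented default.
    if gpu_id is not None:
        cmd += ["--gpu", str(gpu_id)]
    else:
        cmd.append("--cpu")
    if iterations is not None:
        cmd += ["--iterations", str(iterations)]
    if pre_test:
        cmd.append("--pre_test")
    return cmd


if __name__ == "__main__":
    precheck = build_convert_command(
        "pre_model/ResNet-50-deploy.prototxt",
        "pre_model/ResNet-50-model.caffemodel",
        "pre_model/record.txt",
        "caffe-master",
        gpu_id=0,
        pre_test=True,
    )
    print(" ".join(precheck))
```

The returned list can be passed to subprocess.run with cwd set to the sample/resnet50 directory. Calling the helper again with pre_test=False yields the quantization command with otherwise identical arguments.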
Model Accuracy Analysis
Inference and quantization calibration in Quantization Example are performed based on the built-in image dataset. Therefore, the quantization result is used only to verify whether the model is successfully quantized and cannot be used to validate the model accuracy after quantization. This section describes how to compare the model accuracy before and after quantization based on the benchmark ImageNet dataset.
Download the benchmark ImageNet dataset and convert it to the LMDB format in advance.
Preparations
Download the ImageNet dataset and convert it to LMDB format by referring to the caffe-master/examples/imagenet/readme.md file of the Caffe project.
Model Accuracy Analysis
- Test the accuracy of the original model (before quantization).
Run the following command:
python3.7.5 ResNet50_sample.py --model_file pre_model/ResNet-50-deploy.prototxt --weights_file pre_model/ResNet-50-model.caffemodel --gpu 0 --caffe_dir caffe-master --benchmark --dataset /data/Datasets/imagenet/ilsvrc12_val_lmdb --pre_test
For details about the available command-line options, see Table 3-6. If a message similar to the following is displayed, the execution is successful:
******final top1:0.725
******final top5:0.91875
[AMCT][INFO]Run ResNet-50 without quantize success!
- Test the accuracy of the quantized model.
python3.7.5 ResNet50_sample.py --model_file pre_model/ResNet-50-deploy.prototxt --weights_file pre_model/ResNet-50-model.caffemodel --gpu 0 --caffe_dir caffe-master --benchmark --dataset /data/Datasets/imagenet/ilsvrc12_val_lmdb
If messages similar to the following are displayed, the model is successfully quantized. (The top 1 and top 5 inference accuracy results are examples only.)
******final top1:0.7125
******final top5:0.925
[AMCT][INFO]Run ResNet-50 with quantize success!
- Compare the model accuracy before and after quantization based on the tested top1 and top5 accuracy results.
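The comparison can be automated by extracting the top1 and top5 values from the captured output of the two runs. The following Python sketch assumes the "final top1:" and "final top5:" message format shown in the examples above. The helper names parse_accuracy and accuracy_drop are hypothetical.

```python
import re

# Matches fragments such as "final top1:0.725" in the sample-script output.
_FINAL = re.compile(r"final (top[15]):\s*([0-9]*\.?[0-9]+)")


def parse_accuracy(log_text):
    """Extract a dict such as {'top1': 0.725, 'top5': 0.91875} from script output."""
    return {name: float(value) for name, value in _FINAL.findall(log_text)}


def accuracy_drop(before_log, after_log):
    """Return the accuracy lost by quantization per metric (positive means worse)."""
    before = parse_accuracy(before_log)
    after = parse_accuracy(after_log)
    return {key: before[key] - after[key] for key in before if key in after}


if __name__ == "__main__":
    # Example messages copied from this section; real values depend on your run.
    before = "******final top1:0.725 ******final top5:0.91875"
    after = "******final top1:0.7125 ******final top5:0.925"
    for metric, drop in accuracy_drop(before, after).items():
        print(f"{metric}: change of {-drop:+.5f} after quantization")
```

A negative drop means that the quantized model scored higher on that metric, as happens for top5 in the example values.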
Quantizing an Object Detection Network
Prerequisites
Model
For details, see Model.
The Faster R-CNN network is automatically downloaded to the local host during Environment Initialization. This document uses the downloaded Faster R-CNN as an example. You can choose to prepare a network yourself.
Image Dataset
For details, see Image Dataset.
During Environment Initialization, the preset image dataset of the Faster R-CNN network is also downloaded to the local host.
Calibration Dataset
For details, see Calibration Dataset.
Environment Initialization
Initialize the environment to obtain the network source code, model files, weight files, and datasets.
- Obtain necessary files of the object detection network.
- If the server where the AMCT is installed can access the Internet and GitHub:
Run the following script in sample/faster_rcnn in the network quantization package to initialize the environment:
bash init_env.sh arg[1] arg[2] arg[3] arg[4] arg[5]
Table 3-8 describes the arguments in the preceding command. The following is an example of the command. Replace the arguments as required.
bash init_env.sh CPU **/caffe-master/ python3.7 /usr/include/python3.7m
- If the user environment has no Internet access:
- On a server that has Internet access, download the packages from the given links and upload the packages to the sample/faster_rcnn directory:
- faster_rcnn script: available at https://github.com/rbgirshick/caffe-fast-rcnn/archive/0dcd397b29507b8314e252e850518c5695efbb83.zip. Rename the downloaded zip package to faster_rcnn_caffe_master.zip. The faster_rcnn Caffe project is generated after the environment is initialized.
- faster_rcnn caffe_master project: available at https://github.com/rbgirshick/py-faster-rcnn/archive/master.zip. Rename the downloaded zip package to py-faster-rcnn-master.zip and execute it to generate the faster_rcnn project package.
- vgg16_faster_rcnn pre-trained model file: available at https://dl.dropboxusercontent.com/s/o6ii098bu51d139/faster_rcnn_models.tgz. The faster_rcnn pre-trained model is generated after the environment is initialized.
- VOC2007 dataset: available at http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar. The VOC2007 dataset is generated after the environment is initialized.
This dataset is used only for model accuracy test. For details, see Model Accuracy Analysis.
- Go to the sample/faster_rcnn directory and run the initialization script. The following is an example:
bash init_env.sh CPU **/caffe-master/ python3.7 /usr/include/python3.7m
The initialization script first checks whether a downloaded package exists in the current directory. If the package exists, the script does not download it again. Instead, it uses the existing package to generate files in the corresponding directory.
Table 3-8 Environment initialization script arguments
| Option | Description |
|---|---|
| [h] | (Optional) Displays help information. |
| arg[1] | (Required) Sets the run mode, which can be CPU or GPU. NOTE: In CPU mode, only the [--cpu] argument can be used in the quantization command. In GPU mode, the [--gpu GPU_ID] or [--cpu] argument can be used in the quantization command. Select the argument for environment initialization as required. |
| arg[2] | (Required) Sets the absolute path of caffe-master. |
| arg[3] | (Optional) Sets the Python 3 version. The default value is python3. If there are multiple versions, you can use this argument to specify the Python 3 version, for example, python3.7. |
| arg[4] | (Optional) Sets the Python3m path, which must match the Python 3 version. The default value is /usr/include/python3.7m. |
| arg[5] | (Optional) To perform a model test, set this argument to with_benchmark. For details, see Model Accuracy Analysis. |
- After the environment is initialized, the following directories are generated:
- caffe_master_patch: Caffe source code patch folder. You need to manually copy the following files to the caffe-master project.
- include/caffe/fast_rcnn_layers.hpp: header file of the custom layer definition.
- src/caffe: folder of the implementation source files of the custom layer.
Run the following command:
cp -r $HOME/amct/amct_caffe/sample/faster_rcnn/caffe_master_patch/* caffe-master/
- The following folders are added to the amct_caffe_faster_rcnn_sample directory:
- datasets: folder of datasets used by Faster R-CNN.
- pre_model: folder of the Faster R-CNN model file (faster_rcnn_test.pt) and weight file (VGG16_faster_rcnn_final.caffemodel).
- python_tools: folder of Faster R-CNN source code.
- Go back to the caffe-master directory and run the following command to recompile the Caffe environment:
make clean
make all && make pycaffe
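Table 3-8 notes that the run mode passed to init_env.sh limits which device option can later be used in the quantization command. The following Python sketch encodes that rule so that a wrapper script can reject an incompatible combination before starting a long run. The helper names allowed_device_flags and check_device_flag are hypothetical.

```python
def allowed_device_flags(init_mode):
    """Return the device flags usable after initializing in the given run mode."""
    mode = init_mode.upper()
    if mode == "CPU":
        return ("--cpu",)           # CPU mode: only --cpu is allowed
    if mode == "GPU":
        return ("--gpu", "--cpu")   # GPU mode: either flag is allowed
    raise ValueError(f"run mode must be CPU or GPU, got {init_mode!r}")


def check_device_flag(init_mode, flag):
    """Return flag if it is compatible with init_mode; otherwise raise ValueError."""
    if flag not in allowed_device_flags(init_mode):
        raise ValueError(
            f"{flag} cannot be used after initializing in {init_mode} mode")
    return flag


if __name__ == "__main__":
    print(allowed_device_flags("GPU"))
    print(check_device_flag("CPU", "--cpu"))
```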
Quantization Example
- Precheck the original network to test if it can run properly in the Caffe environment.
This step is added to identify risks in advance such as dataset and model mismatch and model execution failures in the Caffe environment.
Switch to the sample/faster_rcnn/amct_caffe_faster_rcnn_sample directory and run the following command to precheck the faster_rcnn network:
python3.7.5 faster_rcnn_sample.py --model_file MODEL_FILE --weights_file WEIGHTS_FILE [--gpu GPU_ID] [--cpu] [--iterations ITERATIONS] [--pre_test]
Table 3-6 describes the command-line options.
An example is as follows.
python3.7.5 faster_rcnn_sample.py --model_file pre_model/faster_rcnn_test.pt --weights_file pre_model/VGG16_faster_rcnn_final.caffemodel --gpu 0 --pre_test
The number of detection result files displayed depends on the number of objects detected in the dataset in amct_caffe_faster_rcnn_sample/datasets. After you close the detection result files, if a message similar to the following is displayed on the server where the AMCT is located, the original model runs properly in the Caffe environment.
[AMCT][INFO]Run faster_rcnn without quantize success!
The pre-check result file is stored in amct_caffe_faster_rcnn_sample/pre_detect_results/.
- Run the quantization script.
python3.7.5 faster_rcnn_sample.py --model_file pre_model/faster_rcnn_test.pt --weights_file pre_model/VGG16_faster_rcnn_final.caffemodel --gpu 0
The number of detection result files displayed depends on the number of objects detected in the dataset in amct_caffe_faster_rcnn_sample/datasets. You can compare the detection box positions on the images with the inference results of the original model obtained with the [--pre_test] option.
Close all detection result files. A quantization success message is displayed on the AMCT server:
[AMCT][INFO]Run faster_rcnn with quantize success!
The post-check result file is stored in amct_caffe_faster_rcnn_sample/quant_detect_results/.
- View the quantization result.
After the quantization is complete, the inference result using the accuracy simulation model obtained after quantization is displayed. The quantization configuration file config.json, the quantization log folder amct_log, the quantization result folder results, and the quantization temporary folder tmp are generated in the same directory as the quantized model.
- config.json: configuration file that describes how to quantize each layer in the model. If a quantization configuration file already exists in the directory of the quantization script, when the create_quant_config API is called again, the existing quantization configuration file is overwritten if the new quantization configuration file has the same name as that of the existing one. Otherwise, a new quantization configuration file is created.
If the accuracy of model inference drops significantly after quantization, you can modify the config.json file by referring to Quantization Configuration.
- amct_log: AMCT log folder, including the quantization log amct_caffe.log.
- pre_detect_results: pre-check result folder.
- quant_detect_results: post-check result folder.
- tmp: quantization temporary folder, including the temporary model files modified_model.prototxt and modified_model.caffemodel, and the quantization factor file scale_offset_record/record.txt. For details about the prototype definition of this file, see Quantization Factor Record File.
- results: quantization result folder, containing the quantized model file and its weight file as well as the quantization information file.
- faster_rcnn_deploy_model.prototxt: quantized model file to be deployed on the Ascend AI Processor.
- faster_rcnn_deploy_weights.caffemodel: weight file of the quantized model to be deployed on the Ascend AI Processor.
- faster_rcnn_fake_quant_model.prototxt: quantized model file for accuracy simulation in the Caffe environment.
- faster_rcnn_fake_quant_weights.caffemodel: weight file of the quantized model file for accuracy simulation in the Caffe environment.
- faster_rcnn_quant.json: quantization information file (named after the quantized model). This file gives the node mapping between the quantized model and the original model and is used for accuracy comparison between the two models.
When a model is re-quantized, the existing result files will be overwritten.
- (Optional) Convert the quantized deployable model into an offline model adapted to the Ascend AI Processor by referring to ATC Tool Instructions.
Model Accuracy Analysis
Inference and quantization calibration in Quantization Example are performed based on the built-in image dataset. Therefore, the quantization result is used only to verify whether the model is successfully quantized and cannot be used to validate the model accuracy after quantization. This section describes how to compare the model accuracy before and after quantization based on the benchmark VOC2007 dataset.
Add the with_benchmark argument during environment initialization to download the benchmark VOC2007 dataset.
Preparations
Run the following command to initialize the environment information for downloading the VOC2007 benchmark dataset:
bash init_env.sh CPU **/caffe-master with_benchmark
Alternatively,
bash init_env.sh CPU **/caffe-master python3.7.5 /usr/include/python3.7m with_benchmark
After the environment is initialized, the VOCdevkit dataset file is generated in the amct_caffe_faster_rcnn_sample/datasets directory in addition to the files regenerated in Environment Initialization.
If the with_benchmark argument is added during environment initialization, all subsequent quantization operations are performed based on the benchmark VOC2007 dataset.
- In CPU mode, only the [--cpu] argument can be used in the quantization command.
- In GPU mode, the [--gpu GPU_ID] or [--cpu] argument can be used in the quantization command.
Select the argument for environment initialization as required.
Model Accuracy Analysis
- Test the accuracy of the original model (before quantization).
Run the following command:
python3.7.5 faster_rcnn_sample.py --model_file pre_model/faster_rcnn_test.pt --weights_file pre_model/VGG16_faster_rcnn_final.caffemodel --gpu 0 --pre_test
For details about the available command-line options, see Table 3-6. If a message similar to the following is displayed, the execution is successful:
[AMCT][INFO]Run faster_rcnn without quantize success, and mAP is 0.8812724482290413
- Test the accuracy of the quantized model.
python3.7.5 faster_rcnn_sample.py --model_file pre_model/faster_rcnn_test.pt --weights_file pre_model/VGG16_faster_rcnn_final.caffemodel --gpu 0
If a message similar to the following is displayed, the model is successfully quantized. The mean average precision (mAP) result is an example only.
[AMCT][INFO]Run faster_rcnn with quantize success, and mAP is 0.8796338534980108!
- Compare the model accuracy before and after quantization based on the tested mAP results.
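The mAP values can be pulled out of the two log messages and compared in a few lines. The following Python sketch assumes the "mAP is <value>" message format shown above. The helper name parse_map is hypothetical.

```python
import re

# Matches fragments such as "mAP is 0.8812724482290413" (a trailing "!" is ignored).
_MAP = re.compile(r"mAP is ([0-9]*\.?[0-9]+)")


def parse_map(log_line):
    """Return the mAP value reported in a log line, or None if there is none."""
    match = _MAP.search(log_line)
    return float(match.group(1)) if match else None


if __name__ == "__main__":
    # Example messages copied from this section; real values depend on your run.
    before = parse_map("[AMCT][INFO]Run faster_rcnn without quantize success, "
                       "and mAP is 0.8812724482290413")
    after = parse_map("[AMCT][INFO]Run faster_rcnn with quantize success, "
                      "and mAP is 0.8796338534980108!")
    print(f"absolute mAP drop: {before - after:.6f}")
    print(f"relative mAP drop: {(before - after) / before:.4%}")
```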
Quantizing the MNIST Network
This model is used to quickly verify the AMCT quantization functionality. Inference and quantization calibration are performed based on the benchmark MNIST dataset. Compare the model accuracy before and after quantization based on the tested accuracy results.
Prerequisites
Model
This section uses the LSTM MNIST network available in the sample package as an example.
Image Dataset
- If the server installed with the AMCT has Internet access:
Create the mnist_data directory in sample/mnist/ on the server, switch to the directory, and run the following commands to obtain image dataset and label files:
wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
- If the user environment has no Internet access:
On a server with Internet access, download the corresponding software packages from the following links and upload them to the sample/mnist/mnist_data directory:
- MNIST dataset: available at http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz. The MNIST test dataset t10k-images-idx3-ubyte is generated in the mnist/mnist_data/ directory after the environment is initialized.
- MNIST label file: available at http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz. The MNIST test label t10k-labels-idx1-ubyte is generated in the mnist/mnist_data/ directory after the environment is initialized.
Since MNIST is a basic network, the environment initialization procedure and the quantization procedure are combined. The MNIST dataset in LMDB format is automatically generated in the mnist/mnist_test_lmdb directory based on the preceding files during the Quantization Example phase.
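Before running the sample, it can be useful to confirm that the downloaded .gz files are intact IDX files. The following Python sketch reads the IDX header, which consists of a big-endian magic number (2051 for image files, 2049 for label files) followed by one 32-bit size per dimension. The helper name read_idx_header is hypothetical, and the check is independent of the AMCT.

```python
import gzip
import struct

IMAGE_MAGIC = 2051  # idx3-ubyte: unsigned bytes, 3 dimensions
LABEL_MAGIC = 2049  # idx1-ubyte: unsigned bytes, 1 dimension


def read_idx_header(path):
    """Return (magic, dims) read from a gzip-compressed IDX file."""
    with gzip.open(path, "rb") as f:
        magic, = struct.unpack(">I", f.read(4))
        ndim = magic & 0xFF  # the low byte of the magic number is the dimension count
        dims = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
    return magic, dims


if __name__ == "__main__":
    # Paths assumed from the mnist_data directory described above.
    for name in ("t10k-images-idx3-ubyte.gz", "t10k-labels-idx1-ubyte.gz"):
        magic, dims = read_idx_header("mnist_data/" + name)
        print(name, magic, dims)
```

For the MNIST test set, the image file is expected to report 10000 images of 28 x 28 pixels, and the label file 10000 labels. Different values suggest a truncated or incorrect download.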
Calibration Dataset
For details, see Calibration Dataset.
Quantization Example
- Go to the sample/mnist directory and run the following command to quantize the MNIST network:
python3.7.5 mnist_sample.py --model_file pre_model/mnist-deploy.prototxt --weights_file pre_model/mnist-model.caffemodel --gpu 0 --caffe_dir caffe-master
For details about the available command-line options, see Table 3-6.
If messages similar to the following are displayed, the model is successfully quantized. The inference accuracy results are examples only.
******final top1:0.9853125 //Inference accuracy of the quantized fake_quant model in the Caffe environment
[AMCT][INFO] mnist top1 before quantize is 0.98515625, after quantize is 0.9853125 //Accuracy test results
[AMCT][INFO]Run mnist sample with quantize success!
- After the quantization is successful, the inference results of the accuracy simulation model after quantization and the accuracy test results before and after quantization are displayed. The quantization configuration file config.json, the quantization log folder amct_log, the quantization result folder results, and the quantization temporary folder tmp are generated in the same directory as the quantized model.
- amct_log: AMCT log folder, including the quantization log amct_caffe.log.
- tmp: quantization temporary folder, containing:
- config.json: configuration file that describes how to quantize each layer in the model. If a quantization configuration file already exists in the directory of the quantization script, when the create_quant_config API is called again, the existing quantization configuration file is overwritten if the new quantization configuration file has the same name as that of the existing one. Otherwise, a new quantization configuration file is created. If the accuracy of model inference drops significantly after quantization, you can modify the config.json file by referring to Quantization Configuration.
- modified_model.prototxt and modified_model.caffemodel: intermediate model files
- record.txt: file that records quantization factors. For details about the prototype definition of the file, see Quantization Factor Record File.
- mnist_data and mnist_test_lmdb: dataset folders
- results: quantization result folder, containing the quantized model file and its weight file as well as the quantization information file.
- mnist_deploy_model.prototxt: quantized model file to be deployed on the Ascend AI Processor.
- mnist_deploy_weights.caffemodel: weight file of the quantized model to be deployed on the Ascend AI Processor.
- mnist_fake_quant_model.prototxt: quantized model file for accuracy simulation in the Caffe environment.
- mnist_fake_quant_weights.caffemodel: weight file of the quantized model file for accuracy simulation in the Caffe environment.
- mnist_quant.json: quantization information file (named after the quantized model). This file gives the node mapping between the quantized model and the original model and is used for accuracy comparison between the two models.
When a model is re-quantized, the existing result files will be overwritten.