create_quant_config
Description
Finds all quantization-capable layers in a graph and writes the quantization configuration information of those layers to the configuration file. Currently, the following layers support quantization: MatMul, Conv2D (dilation = 1), DepthwiseConv2dNative (dilation = 1), Conv2DBackpropInput (dilation = 1), and AvgPool (average pooling).
Prototype
create_quant_config(config_file, graph, skip_layers=None, batch_num=1, activation_offset=True, config_defination=None)
Parameters
Parameter | Input/Return | Description | Restrictions
---|---|---|---
config_file | Input | Path of the quantization configuration file, including the file name. If the file already exists, it is overwritten by this API call. | A string.
graph | Input | A tf.Graph of the model to be quantized. | A tf.Graph.
skip_layers | Input | Quantization-capable layers in the tf.Graph to skip. | A list of strings. Default: None.
batch_num | Input | Number of batches used to generate the quantization factors. | An int greater than 0. Default: 1. Do not set batch_num too large: the product of batch_num and batch_size equals the number of images used during quantization, and too many images consume too much memory.
activation_offset | Input | Whether to quantize data with offset. | A bool. Default: True.
config_defination | Input | Path of a simplified quantization configuration file quant.cfg created from the calibration_config.proto file in /amct_tensorflow/proto/calibration_config.proto in the AMCT installation path. For details about the available options in the calibration_config.proto file and the generated simplified quantization configuration file quant.cfg, see Simplified Quantization Configuration File. | A string. Default: None. If set to None, the configuration file is generated based on the remaining arguments (skip_layers, batch_num, and activation_offset); otherwise, a configuration file in JSON format is generated based on this argument.
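As the batch_num restriction notes, the number of images consumed during quantization equals batch_num multiplied by batch_size. A quick sanity check before calling the API helps keep memory use bounded; the values below are illustrative assumptions, not defaults of the API:

```python
# Illustrative values only: batch_size comes from your own input pipeline.
batch_size = 32   # images per batch fed during calibration
batch_num = 30    # batches used to generate quantization factors

# Total images consumed during quantization = batch_num * batch_size.
total_images = batch_num * batch_size
print(total_images)  # 960
```

If total_images is larger than your calibration set or your memory budget allows, reduce batch_num accordingly.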
Returns
None
Outputs
Outputs a quantization configuration file in JSON format. (If quantization is performed again, the existing quantization configuration file output by this API call is overwritten.)
For example, if only layer_name1 and layer_name2 in a graph support quantization, the quantization configuration file generated by the create_quant_config(config_file, graph, skip_layers=None, batch_num=30, activation_offset=True) call is as follows:
```json
{
    "version": 1,
    "batch_num": 30,
    "activation_offset": true,
    "do_fusion": true,
    "skip_fusion_layers": [],
    "layer_name1": {
        "quant_enable": false,
        "activation_quant_params": [
            {
                "max_percentile": 0.999999,
                "min_percentile": 0.999999,
                "search_range": [0.7, 1.3],
                "search_step": 0.01
            }
        ],
        "weight_quant_params": [
            {
                "channel_wise": true
            }
        ]
    },
    "layer_name2": {
        "quant_enable": true,
        "activation_quant_params": [
            {
                "max_percentile": 0.999999,
                "min_percentile": 0.999999,
                "search_range": [0.7, 1.3],
                "search_step": 0.01
            }
        ],
        "weight_quant_params": [
            {
                "channel_wise": true
            }
        ]
    }
}
```
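Because the output is plain JSON, it can be inspected or adjusted with standard tooling before being reused, for example to disable quantization for a single layer. The sketch below works on an in-memory fragment of the configuration; "layer_name2" is the placeholder layer name from the example above, not a fixed key:

```python
import json

# A fragment of the generated configuration (placeholder layer name).
config_text = """
{
    "version": 1,
    "batch_num": 30,
    "layer_name2": {"quant_enable": true}
}
"""

config = json.loads(config_text)

# Disable quantization for one layer by flipping its quant_enable flag.
config["layer_name2"]["quant_enable"] = False

# Serialize back to JSON, e.g. to rewrite the file passed as config_file.
updated = json.dumps(config, indent=4)
print(config["layer_name2"]["quant_enable"])  # False
```

In practice you would read from and write back to the path passed as config_file instead of an inline string.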
Example
```python
import tensorflow as tf
import amct_tensorflow as amct

# Create a graph of the network to be quantized.
network = build_network()

# Create a quantization configuration file.
amct.create_quant_config(config_file="./configs/config.json",
                         graph=tf.get_default_graph(),
                         skip_layers=None,
                         batch_num=1,
                         activation_offset=True)
```