Quantization Factor Record File
Prototype
The quantization factor record file is a data structure file serialized with the protobuf protocol. You can generate a quantized model file by using the quantization configuration file, the original network model file, and the quantization factor record file. The protobuf prototype is defined as follows.
```proto
message SingleLayerRecord {
    optional float scale_d = 1;
    optional int32 offset_d = 2;
    repeated float scale_w = 3;
    repeated int32 offset_w = 4;
    repeated uint32 shift_bit = 5;
    optional uint32 channels = 6;
    optional uint32 height = 7;
    optional uint32 width = 8;
    optional bool skip_fusion = 9 [default = false];
}

message ScaleOffsetRecord {
    message MapFiledEntry {
        optional string key = 1;
        optional SingleLayerRecord value = 2;
    }
    repeated MapFiledEntry record = 1;
}
```
The parameters are described as follows.
| Message | Required/Optional | Type | Field | Description |
| --- | --- | --- | --- | --- |
| ScaleOffsetRecord | - | - | - | Map structure. To ensure compatibility, a discrete map structure is used. |
| | Repeated | MapFiledEntry | record | Each entry records the quantization factors of one quantization layer and consists of two members: key (the name of the quantization layer) and value (a SingleLayerRecord). |
| SingleLayerRecord | - | - | - | Quantization factors of a single layer. |
| | Optional | float | scale_d | Scale factor for data quantization. Only unified data quantization is supported. |
| | Optional | int32 | offset_d | Offset factor for data quantization. Only unified data quantization is supported. |
| | Repeated | float | scale_w | Scale factor for weight quantization. Scalar mode (the weights of the current layer are quantized in a unified manner) and vector mode (the weights of the current layer are quantized channel-wise) are supported. Only Convolution and Deconvolution layers support channel-wise quantization. |
| | Repeated | int32 | offset_w | Offset factor for weight quantization. Like scale_w, it supports the scalar and vector modes, and its dimensions must match those of scale_w. Weight quantization with offset is currently not supported, so offset_w must be 0. |
| | Repeated | uint32 | shift_bit | Shift factor. Reserved for the convert_model API. |
| | Optional | uint32 | channels | Network-wide infer_shape is not supported, so the input shape of the current layer must be configured. This field specifies the size of the input channel dimension. |
| | Optional | uint32 | height | Network-wide infer_shape is not supported, so the input shape of the current layer must be configured. This field specifies the size of the input height dimension. |
| | Optional | uint32 | width | Network-wide infer_shape is not supported, so the input shape of the current layer must be configured. This field specifies the size of the input width dimension. |
| | Optional | bool | skip_fusion | Whether to skip Conv+BN+Scale, Deconv+BN+Scale, BN+Scale+Conv, and FC+BN+Scale fusion at the current layer. Defaults to false, meaning the preceding fusion types are performed. |
The protobuf protocol does not report an error when an optional field is set multiple times; the most recently set value takes effect.
For general quantization layers, eight parameters need to be configured: scale_d, offset_d, scale_w, offset_w, shift_bit, channels, height, and width. The scale_w and offset_w parameters are unavailable for AVE Pooling since the layer has no weights. An example of the quantization factor record file is as follows:
record { key: "conv1" value: { scale_d: 0.01424 offset_d: -128 scale_w: 0.43213 scale_w: 0.78163 scale_w: 1.03213 offset_w: 0 offset_w: 0 offset_w: 0 shift_bit: 1 shift_bit: 1 shift_bit: 1 channels:3 height: 144 width: 144 skip_fusion: true } } record { key: "pool1" value: { scale_d: 0.532532 offset_d: 13 channels:256 height: 32 width: 32 } } record { key: "fc1" value: { scale_d: 0.37532 offset_d: -67 scale_w: 0.876221 offset_w: 0 shift_bit: 1 channels:1024 height: 1 width: 1 } }
Quantization Factors
The scale and offset quantization factors need to be provided for data and weight quantization. The AMCT uses a unified quantization data structure, shown in the following expression:

$$d_{float} \approx scale \times (d_{int8} - offset)$$

The value ranges are as follows: scale is a float32 greater than 0, offset is an int32, and $d_{int8}$ is a signed 8-bit integer in the range $[-128, 127]$. For quantization without offset, offset is 0.
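As a worked instance of this expression, take the conv1 factors from the example record file above (scale_d = 0.01424, offset_d = -128) and an arbitrary quantized value $d_{int8} = 50$:

$$d_{float} \approx 0.01424 \times (50 - (-128)) = 0.01424 \times 178 \approx 2.535$$

Under this convention, an offset of -128 corresponds to to-be-quantized data whose minimum is 0, such as a ReLU output.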
Quantization is classified into two types: quantization without offset and quantization with offset.
- Principles of the quantization algorithm without offset:
  The original high-precision data and the quantized int8 data are related as follows, where scale is a float32:

  $$d_{float} \approx scale \times d_{int8}$$

  To indicate positive and negative numbers, the signed int8 data type is used for $d_{int8}$. The original data is converted into int8 format as follows, where round is a rounding function; the value to be determined by the quantization algorithm is the constant scale:

  $$d_{int8} = \mathrm{round}\left(\frac{d_{float}}{scale}\right)$$

  Quantization of the weights and the data can therefore be summarized as a search for a scale. Because $d_{int8}$ is a signed number, to ensure symmetry of the ranges represented by positive and negative values, an absolute value operation is first performed on all data, so that the range of the to-be-quantized data becomes $[0, \max(|d_{float}|)]$, and then the scale is determined. The range of positive int8 values is $[0, 127]$. Therefore, scale is computed as follows:

  $$scale = \frac{\max(|d_{float}|)}{127}$$

  The representable int8 range is therefore $[-127, 127]$. Data beyond the range $[-127 \times scale, 127 \times scale]$ is saturated to the boundary value, and then the quantization operation shown in the formula above is performed.
- Principles of the quantization algorithm with offset:
  The difference between the solution with offset and the solution without offset lies in the data conversion mode; here two constants, scale and offset, need to be determined.

  The uint8 data is computed from the original high-accuracy data as shown in the following formula:

  $$d_{uint8} = \mathrm{round}\left(\frac{d_{float}}{scale}\right) + offset$$

  scale is a float32, $d_{uint8}$ is an unsigned 8-bit integer, and offset is an int32. The data range is $[0, 255]$. If the value range of the to-be-quantized data is $[d_{min}, d_{max}]$, scale and offset are computed as follows:

  $$scale = \frac{d_{max} - d_{min}}{255}, \qquad offset = -\mathrm{round}\left(\frac{d_{min}}{scale}\right)$$
- Unified quantization data format:
  By applying a simple conversion to the quantization formula with offset, the quantized data can also be expressed in int8 format, matching the quantization algorithm without offset. The conversion process is as follows (a runnable sketch of both modes follows this list).

  int8 quantization is used as an example. The input original floating-point data is $d_{float}$, the original quantized fixed-point number is $d_{uint8}$, the quantization scale is scale, and the quantization offset is offset (the algorithm requires the data range to cross zero to avoid an excessive accuracy drop). The calculation principle of the quantization is as follows:

  $$d_{float} \approx scale \times (d_{uint8} - offset) = scale \times \big((d_{uint8} - 128) - (offset - 128)\big) = scale \times (d_{int8} - offset')$$

  where $d_{int8} = d_{uint8} - 128$ and $offset' = offset - 128$. Through this conversion, the data is likewise represented in int8 format. After scale and the converted offset' are determined, the int8 data converted from the original floating-point data is obtained as follows:

  $$d_{int8} = \mathrm{round}\left(\frac{d_{float}}{scale}\right) + offset'$$
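The following is a minimal numeric sketch of the conversions above, assuming the min/max-based factor computation described in this section. It is illustrative only and is not the AMCT calibration implementation, which determines scale and offset from collected statistics:

```python
import numpy as np

def quantize_without_offset(d_float):
    """Symmetric int8 quantization: scale = max(|d_float|) / 127."""
    scale = float(np.max(np.abs(d_float))) / 127.0
    # Data beyond [-127 * scale, 127 * scale] saturates to the boundary.
    d_int8 = np.clip(np.round(d_float / scale), -127, 127).astype(np.int8)
    return d_int8, scale

def quantize_with_offset(d_float):
    """Asymmetric quantization via uint8, shifted into the unified int8 form."""
    d_min, d_max = float(np.min(d_float)), float(np.max(d_float))
    scale = (d_max - d_min) / 255.0
    offset = -int(round(d_min / scale))  # uint8 zero point, in [0, 255]
    offset_c = offset - 128              # converted offset' paired with int8 data
    d_int8 = np.clip(np.round(d_float / scale) + offset_c, -128, 127).astype(np.int8)
    return d_int8, scale, offset_c

def dequantize(d_int8, scale, offset=0):
    """Unified form: d_float ≈ scale * (d_int8 - offset); offset is 0 without offset."""
    return scale * (d_int8.astype(np.float32) - offset)

# Round trip on random data crossing zero; the error is on the order of scale.
data = np.random.uniform(-1.0, 2.0, size=1024).astype(np.float32)
q, s, o = quantize_with_offset(data)
print("max abs error:", np.max(np.abs(dequantize(q, s, o) - data)))
```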