What Do I Do If My TensorFlow Network Output Node Is Changed by AMCT?
Symptom
When the quantize_model API of AMCT is called to modify the original TensorFlow network graph, a searchN layer is inserted at the end of the network and the output node at the bottom layer changes. In the quantization script, you need to replace the output node used for inference with the new output node after graph modification, as prompted. The AMCT quantization log provides both the original output node name and the new output node name after graph modification.
Consider the following scenarios in which graph modification changes the output node (a minimal TensorFlow sketch of these patterns follows the list):
- Scenario 1: The bottom layer of the network model is Add or AddV2, and the data to add is one-dimensional, which meets the bias-add condition.
  Figure 4-6 Add or AddV2 as the bottom layer
- If the bottom layer is Add, messages similar to the following are displayed during graph modification:
2020-09-01 09:31:04,896 - WARNING - [AMCT]:[replace_add_pass]: Replace ADD at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'Add:0'
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'bias_add/BiasAdd:0'
2020-09-01 09:31:04,979 - WARNING - [AMCT]:[quantize_model]: Insert searchN operator at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'bias_add/BiasAdd:0'    //Original output node
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'search_n_quant/search_n_quant_SEARCHN/Identity:0'    //New output node
- If the bottom layer is AddV2, messages similar to the following are displayed during graph modification:
2020-09-01 09:32:42,281 - WARNING - [AMCT]:[replace_add_pass]: Replace ADD at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'add:0'
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'bias_add/BiasAdd:0'
2020-09-01 09:32:42,362 - WARNING - [AMCT]:[quantize_model]: Insert searchN operator at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'bias_add/BiasAdd:0'    //Original output node
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'search_n_quant/search_n_quant_SEARCHN/Identity:0'    //New output node
- Scenario 2: The bottom layer of the network is BiasAdd, and its upstream layer is Conv2D, DepthwiseConv2dNative, Conv2DBackpropInput, or MatMul.
  Figure 4-7 BiasAdd as the bottom layer, whose upstream layer is Conv2D
Messages similar to the following are displayed during graph modification:
2020-09-01 09:39:26,130 - WARNING - [AMCT]:[quantize_model]: Insert searchN operator at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'BiasAdd:0'    //Original output node
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'search_n_quant/search_n_quant_SEARCHN/Identity:0'    //New output node
- Scenario 3: The bottom layer of the network is Conv2D, DepthwiseConv2dNative, Conv2DBackpropInput, MatMul, or AvgPool.
  Figure 4-8 Conv2D as the bottom layer
Messages similar to the following are displayed during graph modification:
2020-09-01 09:40:28,717 - WARNING - [AMCT]:[quantize_model]: Insert searchN operator at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'Conv2D:0'    //Original output node
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'search_n_quant/search_n_quant_SEARCHN/Identity:0'    //New output node
- Scenario 4: The bottom layer of the network is FusedBatchNorm, FusedBatchNormV2, or FusedBatchNormV3, and its upstream layer is Conv2D+(BiasAdd) or DepthwiseConv2dNative+(BiasAdd).
  Figure 4-9 FusedBatchNormV3 as the bottom layer
Messages similar to the following are displayed during graph modification:
2020-09-01 09:44:08,637 - WARNING - [AMCT]:[conv_bn_fusion_pass]: Fused BN at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'batch_normalization/FusedBatchNormV3:0'
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'bias_add:0'
2020-09-01 09:44:08,717 - WARNING - [AMCT]:[quantize_model]: Insert searchN operator at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'bias_add:0'    //Original output node
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'search_n_quant/search_n_quant_SEARCHN/Identity:0'    //New output node
- Scenario 5: The bottom layer of the network uses the BN small operator structure, whose input is 4-dimensional.
  Figure 4-10 BN small operator structure as the bottom layer
Messages similar to the following are displayed during graph modification:
2020-09-01 09:46:46,426 - WARNING - [AMCT]:[replace_bn_pass]: Replace BN at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'batch_normalization/batchnorm/add_1:0'
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'batch_normalization/batchnorm/bn_replace/batch_normalization/FusedBatchNormV3:0'
2020-09-01 09:46:46,439 - WARNING - [AMCT]:[conv_bn_fusion_pass]: Fused BN at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'batch_normalization/batchnorm/bn_replace/batch_normalization/FusedBatchNormV3:0'
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'bias_add:0'
2020-09-01 09:46:46,518 - WARNING - [AMCT]:[quantize_model]: Insert searchN operator at the end of the network! You need to replace the old output node by the new output node in inference process!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>The name of the old output node is 'bias_add:0'    //Original output node
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<The name of the new output node is 'search_n_quant/search_n_quant_SEARCHN/Identity:0'    //New output node
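For reference, the following TensorFlow 1.x sketch builds minimal bottom-layer patterns matching the five scenarios above. The input shape, variable names, and layer names are illustrative assumptions, not AMCT requirements, and whether batch normalization appears as a fused operator or as the small-operator structure can also depend on the TensorFlow version:
import tensorflow as tf

# Illustrative input and convolution feeding each bottom-layer pattern.
x = tf.placeholder(tf.float32, [None, 28, 28, 3], name='input')
kernel = tf.Variable(tf.random_normal([3, 3, 3, 16]))
conv = tf.nn.conv2d(x, kernel, strides=[1, 1, 1, 1], padding='SAME', name='Conv2D')
bias = tf.Variable(tf.zeros([16]))

# Scenario 1: Add (or AddV2) whose addend is one-dimensional,
# which meets the bias-add condition.
out_1 = tf.add(conv, bias, name='Add')

# Scenario 2: BiasAdd whose upstream layer is Conv2D.
out_2 = tf.nn.bias_add(conv, bias, name='BiasAdd')

# Scenario 3: Conv2D (or DepthwiseConv2dNative, Conv2DBackpropInput,
# MatMul, AvgPool) directly at the bottom; 'conv' above already ends there.

# Scenario 4: fused BN after Conv2D. With fused=True, TF 1.x typically
# emits a FusedBatchNorm* operator.
out_4 = tf.layers.batch_normalization(conv, fused=True, name='bn_fused')

# Scenario 5: BN small-operator structure on a 4-D input. With fused=False,
# TF 1.x typically expands BN into the mul/add small operators seen in the
# Scenario 5 log ('batch_normalization/batchnorm/add_1:0').
out_5 = tf.layers.batch_normalization(conv, fused=False, name='bn_small')
In each case, after quantize_model() is called the tensor to fetch during calibration inference becomes the searchN Identity output printed in the warnings, not the original bottom operator.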
Script Modification
When the quantize_model API is called to modify the original TensorFlow network graph, a searchN layer is inserted at the end of the network and the output node at the bottom layer changes. In this case, modify the quantization script based on the log so that the output node used during network inference is replaced with the new node name, as follows:
Quantization script before modification (the following script is only an example):
import tensorflow as tf
import amct_tensorflow as amct


def load_pb(model_name):
    with tf.gfile.GFile(model_name, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')


def main():
    # Name of the network .pb file
    model_name = './pb_model/case_1_1.pb'
    # Name of the output node in network quantization inference
    infer_output_name = 'Add:0'
    # Name of the output node of the quantized model
    save_output_name = 'Add:0'
    # Load the .pb file of the network.
    load_pb(model_name)
    # Obtain the network graph.
    graph = tf.get_default_graph()
    # Create a quantization configuration file.
    amct.create_quant_config(
        config_file='./configs/config.json',
        graph=graph)
    # Insert quantization operators.
    amct.quantize_model(
        graph=graph,
        config_file='./configs/config.json',
        record_file='./configs/record_scale_offset.txt')
    # Start network inference.
    with tf.Session() as sess:
        output_tensor = graph.get_tensor_by_name(infer_output_name)
        sess.run(tf.global_variables_initializer())
        sess.run(output_tensor)
    # Save the quantized .pb model file.
    amct.save_model(
        pb_model=model_name,
        outputs=[save_output_name[:-2]],
        record_file='./configs/record_scale_offset.txt',
        save_path='./pb_model/case_1_1')


if __name__ == '__main__':
    main()
The modified quantization script is as follows.
import tensorflow as tf
import amct_tensorflow as amct


def load_pb(model_name):
    with tf.gfile.GFile(model_name, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')


def main():
    # Name of the network .pb file
    model_name = './pb_model/case_1_1.pb'
    # Name of the output node in quantization inference, which needs to be
    # replaced with the new node name printed in the log.
    infer_output_name = 'search_n_quant/search_n_quant_SEARCHN/Identity:0'
    # Name of the output node of the quantized model
    save_output_name = 'Add:0'
    # Load the .pb file of the network.
    load_pb(model_name)
    # Obtain the network graph.
    graph = tf.get_default_graph()
    # Create a quantization configuration file.
    amct.create_quant_config(
        config_file='./configs/config.json',
        graph=graph)
    # Insert quantization operators.
    amct.quantize_model(
        graph=graph,
        config_file='./configs/config.json',
        record_file='./configs/record_scale_offset.txt')
    # Start network inference.
    with tf.Session() as sess:
        output_tensor = graph.get_tensor_by_name(infer_output_name)
        sess.run(tf.global_variables_initializer())
        sess.run(output_tensor)
    # Save the quantized .pb model file.
    amct.save_model(
        pb_model=model_name,
        outputs=[save_output_name[:-2]],
        record_file='./configs/record_scale_offset.txt',
        save_path='./pb_model/case_1_1')


if __name__ == '__main__':
    main()
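If you prefer not to copy the new node name from the log by hand, the following helper is one possible sketch for locating it programmatically. It is not part of the AMCT API; it only assumes, based on the warnings shown above, that the inserted output node is an Identity operator whose name contains 'SEARCHN':
def find_new_output_name(graph, old_output_name):
    """Return the inserted searchN output tensor name, or the old name if none exists."""
    for op in graph.get_operations():
        # Assumption: the node appended by quantize_model is an Identity op
        # whose name contains 'SEARCHN', as in the warning logs above.
        if op.type == 'Identity' and 'SEARCHN' in op.name:
            return op.outputs[0].name
    return old_output_name
Calling it after amct.quantize_model(), for example infer_output_name = find_new_output_name(graph, 'Add:0'), keeps the script working whether or not a searchN layer is inserted. Note that amct.save_model() still takes the original node name without the ':0' tensor index, which is why both scripts pass save_output_name[:-2].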