Advanced Functionality
Feature Options
--out_nodes
Description
Sets the output nodes.
If the output nodes (operator names) are not specified, the last operator of the model is output by default. If specified, information of the specified operators is output.
To check the parameters of a specific layer, specify that layer as an output node. After the model is converted, you can view the parameter settings of the specified operator at the end of the .om model file or of the .json file converted from the .om model file.
See Also
None
Argument
Argument:
- Node names (node_name) in the network model.
Enclose all output nodes in double quotation marks ("") and separate the nodes with semicolons (;). node_name must be a node name in the original model and can be followed by a colon and an output index. For example, node_name1:0 indicates output 0 of the node named node_name1.
- Top names (topname) of a layer.
Enclose all output nodes in double quotation marks ("") and separate the nodes with semicolons (;). The arguments must be top names of layers on the Caffe network before model build. If multiple layers have the same topname, the topname of the last layer is used.
Restrictions:
- The passed argument is either output nodes or top names.
- Only Caffe networks support top name arguments.
- If a selected operator is fused during model conversion, the operator cannot be specified as the output node.
Recommended Configurations and Benefits
None
Example
- Node names (node_name) in the network model
--out_nodes="node_name1:0;node_name1:1;node_name2:0"
- Top names (topname) of a layer
--out_nodes="topname0;topname1;topname2"
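As an illustration, the argument format described above can be checked with a small shell sketch before it is passed to atc (the node names below are the placeholder examples from this section, not from a real model):

```shell
# Validate an --out_nodes argument string: semicolon-separated entries,
# each of the form node_name or node_name:output_index.
OUT_NODES="node_name1:0;node_name1:1;node_name2:0"
if printf '%s' "$OUT_NODES" | grep -Eq '^[A-Za-z0-9_./-]+(:[0-9]+)?(;[A-Za-z0-9_./-]+(:[0-9]+)?)*$'; then
  echo "valid"
else
  echo "invalid"
fi
```

The sketch prints "valid" for well-formed values and "invalid" for values with empty entries or stray separators.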
Dependencies and Restrictions
None
--input_fp16_nodes
Description
Sets the name of the FP16 input node.
See Also
This option is mutually exclusive with --insert_op_conf.
Argument
Argument: Name of the FP16 input node.
Restrictions: The specified nodes must be enclosed in double quotation marks ("") and separated by semicolons (;).
Recommended Configurations and Benefits
None
Example
--input_fp16_nodes="node_name1;node_name2"
Dependencies and Restrictions
None
--insert_op_conf
Description
Sets the configuration file directory (including the file name) of an operator to be inserted, for example, the AIPP operator for data preprocessing.
See Also
This option is mutually exclusive with --input_fp16_nodes.
Argument
Argument: Directory (including the name) of the configuration file of the operator to be inserted.
Format: The directory and file name can contain letters, digits, underscores (_), hyphens (-), and periods (.).
Recommended Configurations and Benefits
None
Example
The following uses the AIPP preprocessing operator as an example. The content of the configuration file is as follows (insert_op.cfg is used as an example file name). For details about the configuration of the AIPP preprocessing configuration file, see AIPP Configuration.
aipp_op {
    aipp_mode:static
    input_format:YUV420SP_U8
    csc_switch:true
    var_reci_chn_0:0.00392157
    var_reci_chn_1:0.00392157
    var_reci_chn_2:0.00392157
}
Upload the configured insert_op.cfg file to any directory on the server where the ATC tool is located. In the following command, /home/Davinci/ is used as an example.
--insert_op_conf=/home/Davinci/insert_op.cfg
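The steps above can be sketched end to end in shell. The configuration content is the example from this section; the file path is a placeholder:

```shell
# Write the example AIPP configuration to a file, then point atc at it.
CFG=/tmp/insert_op.cfg   # placeholder path; use any directory on the ATC server
cat > "$CFG" <<'EOF'
aipp_op {
    aipp_mode:static
    input_format:YUV420SP_U8
    csc_switch:true
    var_reci_chn_0:0.00392157
    var_reci_chn_1:0.00392157
    var_reci_chn_2:0.00392157
}
EOF
grep -c ':' "$CFG"                       # quick check that the file was written
# atc ... --insert_op_conf="$CFG"        # then add the option to the atc command
```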
Dependencies and Restrictions
None
--op_name_map
Description
Sets the directory (including the file name) of the mapping configuration file of a custom operator. Because the function of a custom operator can vary between networks, you can specify the mapping between the custom operator and the actual operator that runs on the network.
See Also
None
Argument
Argument: Directory (including the file name) of the custom operator mapping configuration file.
Format: The directory and file name can contain letters, digits, underscores (_), hyphens (-), and periods (.).
Recommended Configurations and Benefits
None
Example
The following is an example of the content of the custom operator mapping configuration file (opname_map.cfg is used as an example file name):
OpA:Network1OpA
Upload the configured opname_map.cfg file to any directory on the server where the ATC tool is located. In the following command, /home/Davinci/ is used as an example.
--op_name_map=/home/Davinci/opname_map.cfg
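For illustration, the mapping file can be generated and checked as follows. The operator names OpA and Network1OpA are the example values from this section; the path is a placeholder:

```shell
CFG=/tmp/opname_map.cfg   # placeholder path; use any directory on the ATC server
# Each line maps a custom operator name to the operator that runs on this network.
printf '%s\n' 'OpA:Network1OpA' > "$CFG"
# Split one mapping line into its two halves to confirm the format.
awk -F: '{print "custom=" $1, "actual=" $2}' "$CFG"
# atc ... --op_name_map="$CFG"
```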
Dependencies and Restrictions
None
--is_input_adjust_hw_layout
Description
Sets the data type and format of the network inputs to fp16 and NC1HWC0, respectively.
See Also
This option must be used in conjunction with --input_fp16_nodes.
If set to true, the data type and format of the --input_fp16_nodes inputs are set to fp16 and NC1HWC0, respectively.
Argument
Argument: false or true
Default: false
Recommended Configurations and Benefits
None
Example
atc --model=$HOME/test/resnet50.prototxt --weight=$HOME/test/resnet50.caffemodel --framework=0 --output=$HOME/test/out/caffe_resnet50 --is_input_adjust_hw_layout=true --input_fp16_nodes="data" --soc_version=${soc_version}
Dependencies and Restrictions
None
--is_output_adjust_hw_layout
Description
Sets the data type and format of the network outputs to fp16 and NC1HWC0, respectively.
See Also
This option must be used in conjunction with --out_nodes.
If set to true, the data type and format of the --out_nodes outputs are set to fp16 and NC1HWC0, respectively.
Argument
Argument: false or true
Default: false
Recommended Configurations and Benefits
None
Example
atc --model=$HOME/test/resnet50.prototxt --weight=$HOME/test/resnet50.caffemodel --framework=0 --output=$HOME/test/out/caffe_resnet50 --is_output_adjust_hw_layout=true --out_nodes="prob:0" --soc_version=${soc_version}
Dependencies and Restrictions
None
Model Tuning Options
--disable_reuse_memory
Description
Enables or disables memory reuse.
See Also
None
Argument
Argument:
- 1: memory reuse disabled
- 0: memory reuse enabled
Default: 0
Restrictions: If the network model is large and the memory reuse function is disabled, the memory may be insufficient during model conversion. As a result, the model conversion fails.
Recommended Configurations and Benefits
None
Example
--disable_reuse_memory=0
Dependencies and Restrictions
Selects operators to exclude from memory reuse (memory reuse is enabled by default). The operators specified by the OP_NO_REUSE_MEM environment variable (separated by commas) use exclusively allocated memory.
- Configuration by node name:
export OP_NO_REUSE_MEM=gradients/logits/semantic/kernel/Regularizer/l2_regularizer_grad/Mul_1,resnet_v1_50/conv1_1/BatchNorm/AssignMovingAvg2
- Configuration by operator type:
export OP_NO_REUSE_MEM=FusedMulAddN,BatchNorm
- Configuration by operator name and operator type:
export OP_NO_REUSE_MEM=FusedMulAddN,resnet_v1_50/conv1_1/BatchNorm/AssignMovingAvg
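A minimal sketch of setting this variable before model conversion (the operator names are the examples above):

```shell
# Exclude one operator type and one node from memory reuse.
# Note: no spaces around the comma, or the shell will mis-split the value.
export OP_NO_REUSE_MEM=FusedMulAddN,resnet_v1_50/conv1_1/BatchNorm/AssignMovingAvg
# Confirm the value is a single comma-separated list, one entry per line.
printf '%s\n' "$OP_NO_REUSE_MEM" | tr ',' '\n'
```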
--fusion_switch_file
Description
Sets the fusion switch configuration file directory, including the file name.
See Also
None
Argument
Argument: the configuration file directory, including the file name.
Format: The directory and file name can contain letters, digits, underscores (_), hyphens (-), and periods (.).
Restrictions: This option is required if you want to use the model quantized by the Ascend Model Compression and Training Toolkit (AMCT) for operator accuracy comparison. Configure this file to disable the fusion function.
Recommended Configurations and Benefits
None
Example
The following is an example of the configuration file. (fusion_switch.cfg is used as an example file name.) For details about the configuration file, see Fusion Rule Configuration.
RequantFusionPass:off
TbeConvDequantVaddReluQuantFusionPass:off
TbeConvDequantQuantFusionPass:off
TbePool2dQuantFusionPass:off
Upload the configured fusion_switch.cfg file to any directory on the server where the ATC tool is located. In the following command, /home/Davinci/ is used as an example.
--fusion_switch_file=/home/Davinci/fusion_switch.cfg
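Putting the steps together as a shell sketch (placeholder path; the pass names are the examples from this section):

```shell
CFG=/tmp/fusion_switch.cfg   # placeholder path; use any directory on the ATC server
# Disable the four fusion passes listed above, one "PassName:off" per line.
cat > "$CFG" <<'EOF'
RequantFusionPass:off
TbeConvDequantVaddReluQuantFusionPass:off
TbeConvDequantQuantFusionPass:off
TbePool2dQuantFusionPass:off
EOF
grep -c ':off' "$CFG"                 # quick check that the file was written
# atc ... --fusion_switch_file="$CFG"
```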
Dependencies and Restrictions
Usage Restrictions: If the value of the group attribute of the Convolution operator in the network model is equal to the value of the num_output attribute in the prototxt file, RequantFusionPass in the preceding configuration file must be enabled.
--enable_scope_fusion_passes
Description
Enables specific fusion rules in build.
Fusion rules are classified into the following types:
- Built-in fusion rules provided by Huawei:
- General: General scope fusion rules apply to all networks. They take effect by default and cannot be manually disabled.
- Non-General: Non-general scope fusion rules apply to specific networks. They do not take effect by default. You can use --enable_scope_fusion_passes to specify the list of fusion rules to take effect.
- Custom fusion rules:
- General: General scope fusion rules are enabled by default after being loaded and cannot be manually disabled.
- Non-General: Non-general scope fusion rules are disabled by default after being loaded. You can use --enable_scope_fusion_passes to enable specific rules as required.
See Also
None
Argument
Argument: Registered fusion rule names.
Restrictions: Multiple arguments are allowed. Separate them with commas (,), for example, ScopePass1,ScopePass2.
Recommended Configurations and Benefits
None
Example
--enable_scope_fusion_passes=ScopePass1,ScopePass2
Dependencies and Restrictions
Usage Restrictions: This option applies only to the TensorFlow network model. To view logs related to fusion rules during model conversion, set --log to at least the warning level.
--enable_small_channel
Description
Enables small channel optimization. If this function is enabled, performance benefits are generated at the convolutional layers with channel <= 4.
You are advised to enable this function in inference scenarios.
See Also
None
Argument
Argument:
- 0: disabled
- 1: enabled
Default: 0
Restrictions: When this option is enabled, performance benefits can be obtained on the GoogleNet, ResNet-50, ResNet-101, and ResNet-152 networks. For other networks, the performance may deteriorate.
Recommended Configurations and Benefits
None
Example
--enable_small_channel=1
Dependencies and Restrictions
Usage Restrictions: This option should be used in conjunction with the --insert_op_conf AIPP option. Otherwise, there may be no benefits.
Operator Tuning Options
--precision_mode
Description
Selects the operator precision mode.
See Also
None
Argument
Argument:
- force_fp16: forcibly selects fp16 even if the operator supports both fp16 and fp32.
- allow_fp32_to_fp16: if the operator supports fp32, the original fp32 precision is retained; otherwise, fp16 is used.
- must_keep_origin_dtype: retains the original precision. If the network model contains an operator that does not support the fp32 data type, such as the Conv2D convolution operator, this option cannot be used.
- allow_mix_precision: allows mixed precision.
If this mode is configured, you can view the value of the precision_reduce field in the aic-${soc_version}-ops-info.json file in the ${install_path}/opp/op_impl/built-in/ai_core/tbe/config/${soc_version}/ directory of the OPP installation path.
- If set to true, data precision of the operator is fp16.
- If set to false, data precision of the operator is fp32.
- If not set, data precision of the operator is the same as the previous operator.
Default: force_fp16
Recommended Configurations and Benefits
The accuracy and performance of the network model vary according to the configured precision mode.
Precision ranking: must_keep_origin_dtype > allow_fp32_to_fp16 > allow_mix_precision > force_fp16
Performance ranking: force_fp16 >= allow_mix_precision > allow_fp32_to_fp16 > must_keep_origin_dtype
Example
--precision_mode=force_fp16
Dependencies and Restrictions
None
--auto_tune_mode
Description
Selects the automatic tuning mode of the operator. This option enables TBE operator tuning during build to find the optimal performance configuration on the Ascend AI Processor.
For details about the principle and usage of the Auto Tune tool and supported operators, see Auto Tune Tool Instructions.
See Also
None
Argument
Argument:
- GA: Genetic Algorithm, for tuning Cube operators.
- RL: Reinforcement Learning, for tuning Vector operators.
Format: Multiple modes can be configured. Enclose the whole argument in double quotation marks (""), and separate the modes by commas (,), for example, "RL,GA".
Recommended Configurations and Benefits
None
Example
--auto_tune_mode="RL,GA"
- GA tuning result:
- ${install_path}/opp/data/tiling/${soc_version}/built-in/ stores the built-in repository and cost model.
If a shape in the model does not hit the built-in repository, tuning is initiated.
- ${install_path}/atc/data/tiling/${soc_version}/custom/ stores the repository generated after tuning. If a repository exists in this path, information is added to the repository. If no repository exists in the directory, a new repository is created.
- RL tuning result:
- ${install_path}/opp/data/rl/${soc_version}/built-in/ stores the built-in repository generated after tuning.
- ${install_path}/atc/data/rl/${soc_version}/custom/ stores the repository that has a better tuning performance than the repository in the built-in directory or does not exist in built-in directory during tuning.
Dependencies and Restrictions
Usage Restrictions:
- This function supports only the scenario where the development environment and operating environment are co-deployed. When this option is used for tuning, the following environment variables need to be set:
export LD_LIBRARY_PATH=${install_path}/acllib/lib64:${install_path}/atc/lib64:$LD_LIBRARY_PATH
- If a tuned repository exists in the custom directory and the operator logic is changed (for example, support for ND input is added to the GEMM operator), you need to set the following environment variable and perform tuning again.
export REPEAT_TUNE=True
- When auto tuning is enabled, memory allocation is needed for board evaluation. The Auto Tune tool may use more memory than the ATC tool. The required memory is related to the number of devices used at the same time and can be estimated as follows: 2 * Number of devices * Input data size. If the required memory exceeds the memory available to the ATC process, tuning fails.
- During GA tuning, the device resource is exclusively occupied. Therefore, other operations that require the device resource are not allowed. If the tuning fails, stop other processes and perform the tuning again. If there are build services running on the host, the tuning duration is affected.
--op_select_implmode
Description
Selects the operator implementation mode.
In high-precision mode, Newton's method and Taylor's formula are used to improve operator precision. In high-performance mode, optimal performance is achieved without affecting network precision.
See Also
None
Argument
Argument:
- high_precision: high precision implementation mode
- high_performance: high performance implementation mode
Default: high_performance
Recommended Configurations and Benefits
None
Example
--op_select_implmode=high_precision
Dependencies and Restrictions
Usage Restrictions:
--op_select_implmode specifies the high-precision or high-performance mode for all operators. If an operator supports both modes, the mode specified by --op_select_implmode is used during running. If an operator supports only one mode, that mode is used during running. For example:
If an operator supports only high-precision, and --op_select_implmode is set to high_performance, the --op_select_implmode parameter does not take effect and the high-precision mode is implemented.
--optypelist_for_implmode
Description
Lists the operator types (optypes) that use the implementation mode specified by --op_select_implmode.
See Also
This option must be used in conjunction with --op_select_implmode.
Argument
Argument: Operator list.
Restrictions: The operators in the list use the mode specified by --op_select_implmode. Only the Pooling operator is supported.
Recommended Configurations and Benefits
None
Example
--op_select_implmode=high_precision --optypelist_for_implmode=pooling
Dependencies and Restrictions
None
--op_debug_level
Description
Enables TBE operator debug during operator build.
See Also
None
Argument
Argument:
- 0: disables operator debug.
- 1: generates a TBE instruction mapping file for locating AI Core errors. In this case, an operator CCE file (*.cce) and a Python-CCE mapping file (*_loc.json) are generated in the kernel_meta folder in the ATC command execution directory.
- 2: generates a TBE instruction mapping file for locating AI Core errors. In this case, an operator CCE file (*.cce) and a Python-CCE mapping file (*_loc.json) are generated in the kernel_meta folder in the ATC command execution directory, and the CCE compiler options -O0 and -g are enabled, disabling build optimization.
Default: 0
Recommended Configurations and Benefits
None
Example
--op_debug_level=1
Dependencies and Restrictions
None
Debug Options
--dump_mode
Description
Dumps a JSON file with shape information.
See Also
This option must be used in conjunction with --json, --mode=1, --framework, and --om (as well as --weight for a Caffe original model file).
Argument
Argument:
- 0: disabled
- 1: enabled
Default: 0
Recommended Configurations and Benefits
None
Example
atc --mode=1 --om=$HOME/test/resnet18.prototxt --json=$HOME/test/out/resnet18.json --framework=0 --weight=$HOME/test/resnet18.caffemodel --dump_mode=1
Dependencies and Restrictions
None
--log
Description
Sets the level of logs to be printed during ATC model conversion.
See Also
None
Argument
Argument:
- debug: prints debug, info, warning, error, and event logs.
- info: prints info, warning, error, and event logs.
- warning: prints warning, error, and event logs.
- error: prints error and event logs.
- null: prints no logs.
Default: null
Recommended Configurations and Benefits
None
Example
--log=debug
Dependencies and Restrictions
- If the slog process does not exist in the current environment, logs are printed to the screen by default. The --log option still controls which log levels are printed.
- If the slog process exists in the current environment (which can be checked with the ps -ef | grep slog command), logs are written to log files by default. To print logs to the screen instead, set the following environment variable before running the atc command. The --log option still controls which log levels are printed.
export SLOG_PRINT_TO_STDOUT=1
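The two cases above can be combined into a hypothetical helper. It is a sketch only: the helper name is an assumption, and a fixed string stands in for the real process listing so the behavior is easy to verify; in practice you would pass "$(ps -ef)":

```shell
# Decide whether ATC logs go to files or to the screen, based on whether
# an slog process appears in the given process listing.
log_destination() {
  if printf '%s\n' "$1" | grep -q 'slog'; then
    echo "files (set SLOG_PRINT_TO_STDOUT=1 to print to the screen)"
  else
    echo "screen"
  fi
}

log_destination "root 1234 /usr/bin/slogd"   # slog daemon present
log_destination "root 1 /sbin/init"          # no slog daemon
```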