Single-Operator Comparison
Command Syntax
The command for vector comparison is structured as follows:
python3.7.5 msaccucmp.pyc compare -m my_dump_path -g golden_dump_path [-f fusion_rule_file] [-out output] [-op op_name] [-o output_tensor] [-i input_tensor] [-c custom_script_path]
Table 5-3 describes the command-line options.
Option | Description |
---|---|
-m --my_dump_path | (Required) Directory of the dump data of the model running on the Ascend AI Processor |
-g --golden_dump_path | (Required) Directory of the .npy file of the TensorFlow model running on the GPU or CPU |
-f --fusion_rule_file | (Required) Network-wide information file (.json file generated from the .txt graph file using ATC) |
-out --output | (Optional) Path of the comparison result. Defaults to the current path. |
-op --op_name | (Optional) Name of the single-operator |
-o --output_tensor | (Optional) Index of the output to compare. Mutually exclusive with -i. Valid only when -op is configured. If neither -o nor -i is included, the output indexed 0 is compared. |
-i --input_tensor | (Optional) Index of the input to compare. Mutually exclusive with -o. Valid only when -op is configured. |
-c --custom_script_path | (Optional) Customized path to store the .py file for format conversion, which should be the upper-level directory of the format_convert directory. For details about the .py file requirements, see Preparing a Customized .py File for Format Conversion. |
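The option rules in Table 5-3 (-o and -i are mutually exclusive, and both are valid only together with -op) can be checked before the tool is launched. The following is a minimal sketch, not part of msaccucmp itself: build_compare_args is a hypothetical helper that only assembles the argument list that follows msaccucmp.pyc, using the short option names from the table.

```python
from typing import List, Optional


def build_compare_args(
    my_dump_path: str,
    golden_dump_path: str,
    fusion_rule_file: Optional[str] = None,
    output: Optional[str] = None,
    op_name: Optional[str] = None,
    output_tensor: Optional[int] = None,
    input_tensor: Optional[int] = None,
    custom_script_path: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: assemble the arguments that follow msaccucmp.pyc."""
    # Rule from the option table: -o and -i cannot be used together.
    if output_tensor is not None and input_tensor is not None:
        raise ValueError("-o and -i are mutually exclusive")
    # Rule from the option table: -o and -i are valid only when -op is set.
    if op_name is None and (output_tensor is not None or input_tensor is not None):
        raise ValueError("-o and -i are valid only when -op is configured")

    args = ["compare", "-m", my_dump_path, "-g", golden_dump_path]
    if fusion_rule_file is not None:
        args += ["-f", fusion_rule_file]
    if output is not None:
        args += ["-out", output]
    if op_name is not None:
        args += ["-op", op_name]
    if output_tensor is not None:
        args += ["-o", str(output_tensor)]
    if input_tensor is not None:
        args += ["-i", str(input_tensor)]
    if custom_script_path is not None:
        args += ["-c", custom_script_path]
    return args
```

The returned list can be appended to the interpreter and script name and passed to subprocess.run. Rejecting invalid combinations early avoids starting a comparison that the tool would refuse.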
Comparison Procedure
To conduct vector comparison, perform the following steps:
The .json file and directory names in this section are only examples. Replace them with the actual ones. Ensure that the HwHiAiUser user has the read and write permissions on the result path specified by -out (--output).
Single-operator comparison is not supported between two groups of dump data that were both generated on the Ascend AI Processor by the same training task.
- Log in to the OS as the HwHiAiUser user.
- Run the export command to set the environment variable and generate a .json file.
Set the following environment variable:
export LD_LIBRARY_PATH=/home/HwHiAiUser/Ascend/ascend-toolkit/latest/atc/lib64:${LD_LIBRARY_PATH}
Convert the model file into a .json file:
/home/HwHiAiUser/Ascend/ascend-toolkit/latest/atc/bin/atc --mode=5 --om=ge_proto_00005_Build.txt --json=ge_proto_00005_Build.txt.json
The ge_proto_00005_Build.txt file name in the preceding command line is used as an example. You can replace it as required.
Assume that the GE graphs are named ge_proto_*****_Build.txt. The GE graph whose name field contains IteratorV2, Iterator, or GetNext is the computational graph.
- Go to the /home/HwHiAiUser/Ascend/ascend-toolkit/latest/toolkit/tools/operator_cmp/compare directory.
- Run the vector comparison command as follows:
python3.7 msaccucmp.pyc compare -m /home/HwHiAiUser/MyApp_mind/resnet50 -g /home/HwHiAiUser/Standard_tf/resnet50 -f /home/HwHiAiUser/data/ge_proto_00005_Build.txt.json -out /home/HwHiAiUser/result -op gradients/AddN_63 -i 0
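When several GE graphs are dumped, selecting the computational graph for the atc conversion step can be scripted. The following is a minimal sketch under a stated assumption: the graph file is protobuf-style text in which each node name appears on its own line as name: "...". The function names are hypothetical, and the layout should be verified against an actual dump before relying on the result.

```python
import re
from pathlib import Path
from typing import List

# Assumption: node names appear on their own line as  name: "..."
_NAME_FIELD = re.compile(r'^\s*name:\s*"([^"]*)"', re.MULTILINE)
# Markers that identify the computational graph, as listed in the procedure.
_MARKERS = ("IteratorV2", "Iterator", "GetNext")


def is_computational_graph(graph_text: str) -> bool:
    """Return True if any name field contains one of the markers."""
    return any(
        marker in name
        for name in _NAME_FIELD.findall(graph_text)
        for marker in _MARKERS
    )


def find_computational_graphs(dump_dir: str) -> List[Path]:
    """List ge_proto_*_Build.txt files in dump_dir that look like computational graphs."""
    return sorted(
        path
        for path in Path(dump_dir).glob("ge_proto_*_Build.txt")
        if is_computational_graph(path.read_text(errors="ignore"))
    )
```

A file returned by find_computational_graphs is a candidate for the --om argument of the atc command shown in the procedure.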
Figure 5-5 and Figure 5-6 show the content of a vector comparison result file. The single-operator comparison result summary is stored in {op_name}_input_{index}_summary.txt or {op_name}_output_{index}_summary.txt. The parameters are described as follows:
- TotalCount: number of data records in the dump data of the operator
- LeftOp: name of the dumped operator running on the Ascend AI Processor
- RightOp: name of the operator (running on the GPU or CPU) that generates the dump data or .npy file
- Format: data format
- MinAbsoluteError & MaxAbsoluteError: minimum and maximum absolute errors
- MinRelativeError & MaxRelativeError: minimum and maximum relative errors
The detailed comparison result of the single-operator is stored in {op_name}_input_{index}_{file_index}.csv or {op_name}_output_{index}_{file_index}.csv. Each file records a maximum of one million data records. The parameters in Figure 5-6 are described as follows:
- N,C,H,W: data coordinates.
- Left: dump value of the operator running on the Ascend AI Processor.
- Right: dump value of the operator running on the GPU or CPU.
- RelativeError: relative error. The value is obtained by dividing the AbsoluteError value by the dump value of the operator in the Right column. If the dump value of the operator in the Right column is 0, a hyphen (-) is displayed.
- AbsoluteError: absolute error. The value is the difference between the dump value of the operator in the Left column and the dump value of the operator in the Right column.
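The relationship between the Left, Right, AbsoluteError, and RelativeError columns can be illustrated with a short sketch. It is not the tool's implementation. It assumes that the absolute error is the magnitude of Left minus Right and that the relative error divides by the magnitude of Right. The text above states only "difference" and "dividing", so both choices are assumptions. The function name compare_values is hypothetical.

```python
from typing import Tuple, Union


def compare_values(left: float, right: float) -> Tuple[float, Union[float, str]]:
    """Return (AbsoluteError, RelativeError) for one pair of dump values."""
    # Assumption: the error is reported as a non-negative magnitude.
    absolute_error = abs(left - right)
    if right == 0:
        # A hyphen is shown when the Right value is 0.
        return absolute_error, "-"
    # Assumption: divide by the magnitude of the Right value.
    return absolute_error, absolute_error / abs(right)


print(compare_values(1.5, 1.0))
print(compare_values(2.0, 0.0))
```

Returning the hyphen as a string mirrors the result file, where the RelativeError column holds either a number or "-".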