compare_vector.pyc
Data Preparation
Dump File Naming Conventions
The current version supports multiple comparison approaches. Keep the following naming conventions in mind when creating dump files and .npy files.
| Data Type | Naming Format |
| --- | --- |
| Dump data of the non-quantized Caffe model | {op_name}.{output_index}.{timestamp}.pb |
| Dump data of the quantized Caffe model | {op_name}.{output_index}.{timestamp}.quant |
| Dump data of the non-quantized offline model running on the Ascend AI Processor | {op_type}.{op_name}.{task_id}.{timestamp} |
| Dump data of the quantized offline model running on the Ascend AI Processor | {op_type}.{op_name}.{task_id}.{timestamp} |
| Dump data of the non-quantized TensorFlow model | {op_name}.{output_index}.{timestamp}.pb |
| .npy file (Caffe or TensorFlow) | {op_name}.{output_index}.{timestamp}.npy |
Here, op_type and op_name must comply with the A-Za-z0-9_- regular expression rule, timestamp is 16 digits long, and output_index and task_id consist of the digits 0–9.
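The naming rules above can be checked with a small regular-expression sketch. The patterns below are assumptions derived from the conventions in the table (character set A-Za-z0-9_-, a 16-digit timestamp, and digit-only index fields); the tool itself may validate names differently.

```python
import re

# Assumed patterns derived from the naming conventions above.
NPY_NAME = re.compile(r"^[A-Za-z0-9_-]+\.[0-9]+\.[0-9]{16}\.npy$")
OFFLINE_DUMP_NAME = re.compile(r"^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[0-9]+\.[0-9]{16}$")

def is_valid_npy_name(name: str) -> bool:
    """Return True if the name follows {op_name}.{output_index}.{timestamp}.npy."""
    return NPY_NAME.match(name) is not None

def is_valid_offline_dump_name(name: str) -> bool:
    """Return True if the name follows {op_type}.{op_name}.{task_id}.{timestamp}."""
    return OFFLINE_DUMP_NAME.match(name) is not None
```

For example, `Pooling.pool1.1147.1589195081588018` is a valid offline dump name, while a file missing its index or timestamp fields would be rejected.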
Preparing Dump Data of an Offline Model
Prerequisites
Before preparing dump data, convert the model and prepare the model file by referring to ATC Tool Instructions. If model quantization is involved, perform quantization by referring to Ascend Model Compression Toolkit Instructions (Caffe) or Ascend Model Compression Toolkit Instructions (TensorFlow) before model conversion. In addition, build and run the application project with the generated model file and ensure that the project runs properly.
- During the execution of AMCT, a quantization fusion file used for accuracy comparison is also generated, which will be needed for model comparison.
- In Docker scenarios, dump is not supported in containers.
- Data dump can be implemented by calling the aclInit() or aclmdlSetDump() API.
Single-operator dump can be implemented only by calling the aclmdlSetDump() API.
For details about the aclInit() and aclmdlSetDump() APIs, see Application Software Development Guide.
Generating Dump Data
Perform the following steps to dump data of the offline model:
- Open the project file and find the path of the acl.json file passed in the aclInit() or aclmdlSetDump() call.
If aclInit() or aclmdlSetDump() is called with an empty path, pass the path of the acl.json file created in the next step to the call. The acl.json path is relative to the directory of the binary file generated during project build.
- Modify the acl.json file in that directory (if the file does not exist, create it in the out directory generated after project build) and add the dump configuration in the following format.
{
    "dump": {
        "dump_list": [
            {
                "model_name": "ResNet-101"
            },
            {
                "model_name": "ResNet-50",
                "layer": [
                    "conv1conv1_relu",
                    "res2a_branch2ares2a_branch2a_relu",
                    "res2a_branch1",
                    "pool1"
                ]
            }
        ],
        "dump_path": "/home/HwHiAiUser/output",
        "dump_mode": "output",
        "dump_op_switch": "off"
    }
}
The dump configuration rules are as follows:
- In the sample, the dump, dump_list, and dump_path fields are required, while the model_name, layer, dump_mode, and dump_op_switch fields are optional.
- To dump all operators of a model, the layer field does not need to be included.
- To dump certain operators, configure each operator in a line in the layer field and separate them with commas (,).
- To dump multiple models, add the dump configuration for each model and separate them with commas (,).
- To dump a single-operator model, leave dump_list empty and set dump_op_switch to on.
The configuration items are as follows:
- dump_list: the list of models for network-wide data dump
- model_name: model name
To load a model from a file, enter the model file name without the file name extension. To load a model from memory, set this parameter to the value of the name field in the .json file after model conversion. For details about how to load a model, see AscendCL API Reference in Application Software Development Guide.
- layer: operator name
- dump_path: dump path in the operating environment
Both absolute paths and relative paths (relative to the directory where the command is run) are supported.
- An absolute path starts with a slash (/), for example, /home/HwHiAiUser/output.
- A relative path starts with a directory name, for example, output.
For example, if dump_path is set to /home/HwHiAiUser/output, the dump data is generated to the /home/HwHiAiUser/output directory in the operating environment.
The directory specified by this parameter must be created in advance and the user configured during installation must have the read and write permissions on the directory.
- dump_mode: (optional) dump mode, either input, output (default), or all
- input: dumps the operator input only.
- output: dumps the operator output only.
- all: dumps both the operator input and output.
- dump_op_switch: (optional) dump switch of the single-operator model, either on or off (default)
- on: enables dump for the single-operator model.
- off: disables dump for the single-operator model.
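Putting the rules above together, a minimal acl.json for dumping a single-operator model can be sketched as follows (dump_list left empty and dump_op_switch set to on, as described above; the dump_path is an example and must be created in advance):

```json
{
    "dump": {
        "dump_list": [],
        "dump_path": "/home/HwHiAiUser/output",
        "dump_op_switch": "on"
    }
}
```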
- For TBE and AI CPU operators that do not output results (for example, StreamActive, Send, Recv, and const), dump data is not generated. For operators that are not executed on the AI CPU or AI Core after build (for example, concatD), dump data cannot be generated.
- Because Data operators are not executed on the AI CPU or AI Core, when only certain operators are to be dumped you need to select all the downstream operators of the Data operator.
- If dump data files are generated by running in the command line, delete the data files that are no longer needed in the /home/HwHiAiUser/ide_daemon/dump directory to prevent dump data generation failures due to insufficient disk space.
- When modifying the acl.json file, ensure that each model_name is unique.
- If the model is loaded from a file, model_name can also be set to the value of the name field in the .json file generated after model conversion.
If the acl.json file contains model_name entries set to both the model file name and the model name, the model file name takes effect.
- Run the application project to generate a dump file.
Find the generated dump file in {dump_path} in the operating environment. The path and format are described as follows:
time/deviceid/model_name/model_id/data_index/dump_file
- time: dump time, formatted as YYYYMMDDhhmmss
- deviceid: device ID
- model_name: model name
- model_id: model ID
- data_index: execution sequence number of each task, indexed starting at 0. This value is increased by 1 every dump.
- dump_file: formatted as {op_type}.{op_name}.{taskid}.{timestamp}
Periods (.), forward slashes (/), backslashes (\), and spaces in model_name, op_type or op_name are replaced with underscores (_).
Dump data of a single-operator model is generated to {dump_path}/deviceid/op_name/dump_file.
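The dump file name format above can be split back into its fields with a small helper. This sketch assumes the four-field layout {op_type}.{op_name}.{task_id}.{timestamp} described above, and that any periods inside the original names were already replaced with underscores when the file was dumped; the example path is hypothetical.

```python
import os

def parse_dump_file_name(file_name):
    """Split a dump file name of the form {op_type}.{op_name}.{task_id}.{timestamp}.

    Assumes periods inside op_type/op_name were replaced with underscores
    at dump time, so exactly four dot-separated fields remain.
    """
    op_type, op_name, task_id, timestamp = file_name.split(".")
    return {"op_type": op_type, "op_name": op_name,
            "task_id": task_id, "timestamp": timestamp}

# Hypothetical path laid out as time/deviceid/model_name/model_id/data_index/dump_file:
path = "20200511120000/0/resnet50/1/0/Pooling.pool1.1147.1589195081588018"
info = parse_dump_file_name(os.path.basename(path))
```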
Preparing .npy Data of a Caffe Model
This version does not support generating .npy files of a Caffe model. You need to install the Caffe environment and prepare the .npy files in advance; the detailed procedure is not covered here. This section provides only an example of a Caffe .npy file that meets the accuracy comparison requirements.
To use a dump file in binary format for comparison, convert the .npy file to a dump file by referring to How Do I Convert an .npy File into a Dump File?.
Prepare the .npy file as follows:
- Save the file content in NumPy format.
- Name the file in op_name.output_index.timestamp.npy format. Ensure that the output_index field is contained in the .npy file name and the output_index value of the generated dump data is 0. This is because by default, accuracy comparison starts from the first data whose output_index is 0. Otherwise, no comparison result is returned.
- To ensure that the .npy file is correctly named, remove the in-place operations from the Caffe model file to generate a new .prototxt model file for .npy file generation. For example, if there are four fused operators A, B, C, and D and in-place is not removed, the dumped output data is that of operator D, while the file name is prefixed with the name of operator A. For quantization scenarios, install AMCT in the environment before running the command to remove in-place operations. For details about the installation, see Ascend Model Compression Toolkit Instructions (Caffe).
Go to the /home/HwHiAiUser/Ascend/ascend-toolkit/latest/toolkit/tools/operator_cmp/compare directory and run the following command to remove in-place:
python3.7.5 inplace_layer_process.pyc -i /home/user/resnet50.prototxt
After the command is executed, the new_resnet50.prototxt file with in-place removed is generated in the /home/user directory.
- For quantization scenarios, to ensure accuracy, data pre-processing during Caffe model inference must be the same as that during Caffe model compression.
To generate an .npy file that meets the accuracy comparison requirements, add similar code as follows after the inference is complete.
# read prototxt file
net_param = caffe_pb2.NetParameter()
with open(self.model_file_path, 'rb') as model_file:
    google.protobuf.text_format.Parse(model_file.read(), net_param)

# save data to numpy file
for layer in net_param.layer:
    name = layer.name.replace("/", "_").replace(".", "_")
    index = 0
    for top in layer.top:
        data = net.blobs[top].data[...]
        file_name = name + "." + str(index) + "." + str(
            round(time.time() * 1000000)) + ".npy"
        output_dump_path = os.path.join(self.output_path, file_name)
        np.save(output_dump_path, data)
        os.chmod(output_dump_path, FILE_PERMISSION_FLAG)
        print('The dump data of "' + layer.name + '" has been saved to "'
              + output_dump_path + '".')
        index += 1
After the preceding code is added, run the application project of the Caffe model to generate an .npy file that meets the requirements.
Preparing .npy Data of a TensorFlow Model
This version does not support generating .npy files of a TensorFlow model. You need to install the TensorFlow environment and prepare the .npy files in advance. This section provides only an example of a TensorFlow .npy file for reference.
To use a dump file in binary format for comparison, convert the .npy file to a dump file by referring to How Do I Convert an .npy File into a Dump File?.
Before generating .npy files of a TensorFlow model, a complete, executable, and standard TensorFlow model application project is required. You can use the TensorFlow debugger (tfdbg) to generate .npy files. The major steps are as follows:
- Modify the TensorFlow application project script to add the debugging configuration option by adding the following code:
In Estimator mode:
from tensorflow.python import debug as tf_debug
training_hooks = [train_helper.PrefillStagingAreaHook(), tf_debug.LocalCLIDebugHook()]
In session.run mode:
from tensorflow.python import debug as tf_debug
sess = tf_debug.LocalCLIDebugWrapperSession(sess, ui_type="readline")
- In Estimator mode, add the tfdbg hook, as shown in Figure 5-7.
- In session.run mode, set the tfdbg wrapper before run, as shown in Figure 5-8.
- Run the inference script.
In the interactive debugger command line, enter run to run the script.
- Collect .npy files.
After the script is executed, you can run the lt command to query the stored tensors, run the pt command to view the tensor content, and save it as a file in NumPy format.
The tfdbg dumps only one tensor at a time. To dump all tensors, perform the following steps:
- Run lt > tensor_name to temporarily store all tensor names to a file.
- Exit the tfdbg command line, enter the Linux command line, and run the following command to generate commands to run in tfdbg:
timestamp=$[$(date +%s%N)/1000] ; cat tensor_name | awk '{print "pt",$4,$4}' | awk '{gsub("/", "_", $3);gsub(":", ".", $3);print($1,$2,"-n 0 -w "$3".""'$timestamp'"".npy")}' > tensor_name_cmd.txt
- The generated pt commands are stored in the tensor_name_cmd.txt file, and the resulting .npy file names meet the naming rules for accuracy comparison. Here, tensor_name is the name of the file that stores all tensor names and timestamp is 16 digits long.
- You can also run the command in the new window without exiting the tfdbg command line.
- Go back to the tfdbg command line, run the script, and run the command generated in the previous step for saving all .npy files.
By default, .npy files are stored using numpy.save(). In the file names, slashes (/) are replaced with underscores (_) and colons (:) with periods (.).
If the command cannot be pasted on the CLI, run the mouse off command in the tfdbg command line to disable the mouse mode before pasting again.
- Check whether names of the generated .npy files comply with the naming rules, as shown in Figure 5-9.
- Names of the .npy files are in {op_name}.{output_index}.{timestamp}.npy format, where op_name must comply with the A-Za-z0-9_- regular expression rule, timestamp is 16 digits long, and output_index consists of the digits 0–9.
- If the name of an .npy file exceeds 255 characters due to the long operator name, comparison of this operator is not supported.
- The name of some .npy files may not meet the naming requirements due to the tfdbg or operating environment. You can manually rename the files based on the naming rules. If there are a large number of .npy files that do not meet the requirements, generate .npy files again by referring to How Do I Handle Exceptions in the Generated .npy File Names in Batches?
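Before running the comparison, non-compliant .npy names can be found with a quick scan. This sketch assumes the naming rule and 255-character limit described above; the regular expression is an assumption derived from those rules, not part of the toolkit.

```python
import os
import re

# Assumed pattern from the naming rules: {op_name}.{output_index}.{timestamp}.npy
VALID_NPY = re.compile(r"^[A-Za-z0-9_-]+\.[0-9]+\.[0-9]{16}\.npy$")

def find_invalid_npy_names(directory):
    """Return .npy file names in a directory that break the naming rules,
    including names longer than 255 characters, which cannot be compared."""
    invalid = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".npy"):
            continue
        if len(name) > 255 or VALID_NPY.match(name) is None:
            invalid.append(name)
    return invalid
```

Any names it reports can then be fixed manually or regenerated as described above.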
Vector Comparison
Restrictions
Vector comparison by model and by single operator are supported. Select a comparison mode as required.
Note the following restrictions on vector comparison:
- The dump data of the offline model and the .npy file or dump data of the third-party model (Caffe or TensorFlow) to be compared must be obtained from counterpart models. Otherwise, only the comparison results of the counterpart operators are displayed.
- For the Fast R-CNN network, the comparison result is subject to the accuracy of the FSRDetectionOutput operators. It is normal that the ProposalD operator and its downstream operators offer low accuracy.
- During graph build, if some operators of the graph are fused, the outputs of these operators can no longer be found in the built model. As a result, the comparison of these operators is unavailable.
- During graph build, if the structure of a graph is modified (such as strided slice, L1 fusion, and L2 fusion), the comparison of the inputs or outputs of the operators is unavailable.
- When the counterpart operators require different shapes (for example, the offline model operator requires a reduced shape), or format conversion is not supported, the comparison of these operators is unavailable.
- If Data Pre-Processing is switched on (for example, the input of the data operator is set to YUV in AIPP), input format of the data operators may be different from that of the original model, leading to an unreliable comparison result.
- If the corresponding fusion rule is enabled, a quantized operator will be fused with its upstream operator. As a result, the comparison result of the operator output is unreliable.
- In a quantized model, the comparison of quantized operators is only available after being dequantized. Comparison of non-quantized operators in a quantized model is available. For example, in a quantized model, the comparison of the output of the AscendQuant operator is not available.
Comparison Data Description
Before vector comparison, prepare the data by referring to Table 5-8.
Select files for My Output as follows:
- Offline model file: Required if to compare the dump data of a model running on the Ascend AI Processor with Ground Truth.
- Quantization fusion file: Required if to compare quantized and non-quantized dump data.
| No. | Data to Be Compared (My Output) | Standard Data (Ground Truth) | Model File/Quantization Fusion File |
| --- | --- | --- | --- |
| 1 | Dump data of the non-quantized offline model running on the Ascend AI Processor | Dump data or .npy file of the non-quantized Caffe model | Non-quantized offline model file (.om) |
| 2 | Dump data of the quantized offline model running on the Ascend AI Processor | Dump data or .npy file of the non-quantized Caffe model | |
| 3 | Dump data or .npy file of the quantized Caffe model | Dump data or .npy file of the non-quantized Caffe model | Quantization fusion file (.json) after model compression |
| 4 | Dump data of the quantized offline model running on the Ascend AI Processor | Dump data or .npy file of the quantized Caffe model | Quantized offline model file (.om) |
| 5 | Dump data of the non-quantized offline model running on the Ascend AI Processor | Dump data or .npy file of the non-quantized TensorFlow model | Non-quantized offline model file (.om) |
| 6 | Dump data of the model running on the Ascend AI Processor | Dump data of the model running on the Ascend AI Processor | - |
Model Comparison
Command Syntax
The command for vector comparison is structured as follows:
python3.7.5 compare_vector.pyc -l LEFT_DUMP_PATH -r RIGHT_DUMP_PATH [-f FUSION_JSON_FILE_PATH] [-q QUANT_FUSION_RULE_FILE_PATH] -o OUTPUT_PATH [-custom CUSTOM_PATH]
- -l: directory of the compared data of the My Output model
- -r: directory of the compared data of the Ground Truth model
- -f: (optional) network-wide information file of the offline model (the .om file can be converted into a .json file using ATC)
- -q: (optional) quantization fusion file
- -o: path and file name of the comparison result
- -custom: (optional) customized path to store the .py file for format conversion, which should be the upper-level directory of the format_convert directory. For details about the .py file requirements, see Preparing a Customized .py File for Format Conversion.
Select the -f or -q argument based on the data prepared in Comparison Data Description.
Comparison Procedure
To conduct vector comparison, perform the following steps:
- Replace the .json file and directory names in this example with the actual ones. The result and log paths must be created in advance and the HwHiAiUser user must have the read and write permissions on the paths.
- This section describes how to compare the dump data of a non-quantized model running on the Ascend AI Processor and the .npy file of a non-quantized Caffe model. The following parameters are based on this example. You can replace them as required.
- To compare two groups of dump data generated based on the same model running on the Ascend AI Processor, ensure that the number of inputs and outputs, formats, as well as the shapes are the same. Otherwise, comparison cannot be performed. In this scenario, only the -l, -r and -o options are required.
- Log in to the development environment as the HwHiAiUser user.
- Run the export command to set the environment variable and generate a .json file.
Set the following environment variable:
export LD_LIBRARY_PATH=/home/HwHiAiUser/Ascend/ascend-toolkit/latest/atc/lib64:${LD_LIBRARY_PATH}
Generate the .json file:
/home/HwHiAiUser/Ascend/ascend-toolkit/latest/atc/bin/atc --mode=1 --om=/home/HwHiAiUser/data/resnet50.om --json=/home/HwHiAiUser/data/resnet50.json
- Go to the /home/HwHiAiUser/Ascend/ascend-toolkit/latest/toolkit/tools/operator_cmp/compare directory.
- Run the vector comparison command as follows:
python3.7.5 compare_vector.pyc -l /home/HwHiAiUser/MyApp_mind/resnet50 -r /home/HwHiAiUser/Standard_caffe/resnet50 -f /home/HwHiAiUser/data/resnet50.json -o /home/HwHiAiUser/result/result.txt
To save the result in CSV format, change the -o option to -o /home/HwHiAiUser/result/result.csv and add the -csv option.
- The vector comparison result is saved to the result.txt file, as shown in Figure 5-10.
The parameters are described as follows:
- LeftOp: operator name of the My Output model
- RightOp: operator name of the Ground Truth model
- TensorIndex: operator input ID and output ID of the My Output model
- CosineSimilarity: result of the cosine similarity comparison. The value range is [–1, +1]. A value closer to 1 indicates higher similarity. A value closer to –1 indicates greater difference.
- MaxAbsoluteError: result of the maximum absolute error comparison. A value closer to 0 indicates higher similarity. Otherwise, it indicates greater difference.
- AccumulatedRelativeError: result of the accumulated relative error comparison. A value closer to 0 indicates higher similarity. Otherwise, it indicates greater difference.
- RelativeEuclideanDistance: result of the Euclidean relative distance comparison. A value closer to 0 indicates higher similarity. Otherwise, it indicates greater difference.
- KullbackLeiblerDivergence: result of the KLD comparison. The value ranges from 0 to infinity. The smaller the KLD, the closer the approximate distribution is to the true distribution.
- StandardDeviation: result of the standard deviation comparison. The value ranges from 0 to infinity. The smaller the standard deviation, the smaller the dispersion, and the closer the value is to the average value. The mean value and standard deviation of the dump data are displayed in the format of (mean value;standard deviation). The first set of data is the result of My Output, and the second set is the result of Ground Truth.
- An asterisk (*) indicates a newly added operator with no third-party counterpart. NaN indicates there is no comparison result.
- If the cosine similarity and KLD results are NaN while the results of the other algorithms exist, at least one piece of data on the left or the right is 0. If the KLD result is inf, one piece of data on the right is 0. If all results are NaN, the dump data itself contains NaN.
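Some of the simpler metrics above can be reproduced with a few lines of NumPy. The formulas below are assumptions inferred from the metric names and descriptions, not the tool's exact implementation (which may, for example, handle all-zero data differently):

```python
import numpy as np

def compare_tensors(left, right):
    """Sketch of three comparison metrics for two dump arrays.

    left is the My Output data, right the Ground Truth data.
    Assumes neither array is all zeros (otherwise the cosine and
    relative-distance denominators would be zero, yielding NaN/inf).
    """
    left = left.astype(np.float64).flatten()
    right = right.astype(np.float64).flatten()
    cosine = float(np.dot(left, right) /
                   (np.linalg.norm(left) * np.linalg.norm(right)))
    max_abs_err = float(np.max(np.abs(left - right)))
    rel_euclidean = float(np.linalg.norm(left - right) / np.linalg.norm(right))
    return {"CosineSimilarity": cosine,
            "MaxAbsoluteError": max_abs_err,
            "RelativeEuclideanDistance": rel_euclidean}
```

For identical inputs the cosine similarity is 1 and both error metrics are 0, matching the "closer to 1 / closer to 0" interpretation above.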
Single-Operator Comparison
Command Syntax
The command for single-operator comparison is structured as follows:
python3.7.5 compare_vector.pyc -l LEFT_DUMP_PATH -r RIGHT_DUMP_PATH [-f FUSION_JSON_FILE_PATH] [-q QUANT_FUSION_RULE_FILE_PATH] -o OUTPUT_PATH -d OP_NAME [-t DETAIL_TYPE] [-i DETAIL_INDEX] [-custom CUSTOM_PATH]
The command-line options are described as follows:
- -l: directory of the compared data of the My Output model
- -r: directory of the compared data of the Ground Truth model
- -f: (optional) network-wide information file of the offline model (the .om file can be converted into a .json file using ATC)
- -q: (optional) quantization fusion file
- -o: path of the comparison result
- -d: name of the single-operator layer to be compared of the My Output model
- -t: (optional) indicates whether the dump file to be compared is the input or output. Defaults to output.
- -i: (optional) input_index or output_index of the operator of the My Output model
- -custom: (optional) customized path to store the .py file for format conversion, which should be the upper-level directory of the format_convert directory. For details about the .py file requirements, see Preparing a Customized .py File for Format Conversion.
Select the -f or -q argument based on the data prepared in Comparison Data Description.
Comparison Procedure
To conduct vector comparison, perform the following steps:
- Replace the .json file and directory names in this example with the actual ones. The result and log paths must be created in advance and the HwHiAiUser user must have the read and write permissions on the paths.
- This section describes how to compare the dump data of a non-quantized model running on the Ascend AI Processor and the .npy file of a non-quantized Caffe model. The following parameters are based on this example. You can replace them as required.
- Single-operator comparison between two groups of dump data generated based on the same model running on the Ascend AI Processor is not supported.
- Log in to the development environment as the HwHiAiUser user.
- Run the export command to set the environment variable and generate a .json file.
Set the following environment variable:
export LD_LIBRARY_PATH=/home/HwHiAiUser/Ascend/ascend-toolkit/latest/atc/lib64:${LD_LIBRARY_PATH}
Generate the .json file:
/home/HwHiAiUser/Ascend/ascend-toolkit/latest/atc/bin/atc --mode=1 --om=/home/HwHiAiUser/data/resnet50.om --json=/home/HwHiAiUser/data/resnet50.json
- Go to the /home/HwHiAiUser/Ascend/ascend-toolkit/latest/toolkit/tools/operator_cmp/compare directory.
- Run the vector comparison command as follows:
python3.7.5 compare_vector.pyc -l /home/HwHiAiUser/MyApp_mind/resnet50 -r /home/HwHiAiUser/Standard_caffe/resnet50 -f /home/HwHiAiUser/data/resnet50.json -o /home/HwHiAiUser/result -d pool5 -i 0
- Figure 5-11 and Figure 5-12 show the content of a vector comparison result file.
The single-operator comparison result summary is stored in {op_name}_input_{index}_summary.txt or {op_name}_output_{index}_summary.txt. The parameters are described as follows:
- TotalCount: number of data records in the dump data of the operator
- LeftOp: operator name of the My Output model
- RightOp: operator name of the Ground Truth model
- Format: data format
- MinAbsoluteError & MaxAbsoluteError: minimum and maximum absolute error ranges
- MinRelativeError & MaxRelativeError: minimum and maximum relative error ranges
The detailed comparison result of the single-operator is stored in {op_name}_input_{index}_{file_index}.csv or {op_name}_output_{index}_{file_index}.csv. Each file records a maximum of one million data records. The parameters in Figure 5-12 are described as follows:
- N C H W: data coordinates
- Left: dump data of the My Output model operator
- Right: dump data of the Ground Truth model operator
- RelativeError: relative error. The value is obtained by dividing the AbsoluteError value by the dump value of the Ground Truth operator. If the dump value of the Ground Truth operator is 0, a hyphen (-) is displayed.
- AbsoluteError: absolute error. The value is the difference between the dump value of the My Output operator and that of the Ground Truth operator.
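The AbsoluteError and RelativeError columns above can be sketched per element as follows. This is an assumption based on the column descriptions, including the hyphen emitted when the Ground Truth value is 0:

```python
import numpy as np

def error_records(my_output, ground_truth):
    """Per-element (Left, Right, RelativeError, AbsoluteError) rows.

    AbsoluteError = Left - Right; RelativeError = AbsoluteError / Right,
    with a hyphen where the Ground Truth value is 0.
    """
    rows = []
    for left, right in zip(my_output.flatten(), ground_truth.flatten()):
        abs_err = float(left) - float(right)
        rel_err = "-" if right == 0 else abs_err / float(right)
        rows.append((float(left), float(right), rel_err, abs_err))
    return rows
```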
How Do I Convert an .npy File into a Dump File?
Converting the Caffe .npy File into a Dump File
After obtaining the Caffe .npy file, execute the dump_data_conversion.pyc script to convert it into a dump file in binary format. The command syntax is as follows:
python3.7.5 dump_data_conversion.pyc -type TYPE -target TARGET -i INPUT_PATH -o OUTPUT_PATH
- -type: (required) data type, selected from:
- quant: quantized Caffe model data
- tf: non-quantized TensorFlow model data
- caffe: non-quantized Caffe model data
- offline: offline model data
- sim: functional simulation data
- -target: (required) target format, either numpy or dump
- numpy: converts a dump file into a NumPy file.
- dump: converts a NumPy file into a dump file.
- -i: (required) data folder/file path
- Converting NumPy files into dump files:
To pass a folder path to -i, ensure that the names of the files in the folder are in op_name.output_index.timestamp.npy format.
To pass a file path to -i, ensure that the file name is in op_name.output_index.timestamp.npy format. Only one file can be specified at a time.
op_name must comply with the A-Za-z0-9_- regular expression rule, timestamp is 16 digits long, and output_index consists of the digits 0–9.
- Converting dump files into NumPy files:
To pass a folder path to -i, ensure that the names of the files in the folder comply with the naming conventions described in Dump File Naming Conventions. To pass a file path to -i, ensure that the file name complies with those conventions.
- -o: (required) output path
The following is a command example for converting the .npy file into a dump file:
python3.7.5 dump_data_conversion.pyc -type caffe -target dump -i /home/HwHiAiUser/caffenpyfile -o /home/HwHiAiUser/caffedump
- You can find the dump_data_conversion.pyc script in /home/HwHiAiUser/Ascend/ascend-toolkit/latest/toolkit/tools/operator_cmp/compare.
- To use this script for conversion, ensure that the host has at least 15 GB disk space. If the size of a single dump file to be converted exceeds 441 MB, you are advised to use a host with larger disk space.
Converting the TensorFlow .npy File into a Dump File
After obtaining the TensorFlow .npy file, execute the dump_data_conversion.pyc script to convert it into a dump file in binary format. The command syntax is as follows:
python3.7.5 dump_data_conversion.pyc -type TYPE -target TARGET -i INPUT_PATH -o OUTPUT_PATH
- -type: (required) data type, selected from:
- quant: quantized Caffe model data
- tf: non-quantized TensorFlow model data
- caffe: non-quantized Caffe model data
- offline: offline model data
- sim: functional simulation data
- -target: (required) target format, either numpy or dump
- numpy: converts a dump file into a NumPy file.
- dump: converts a NumPy file into a dump file.
- -i: (required) data folder/file path
- Converting NumPy files into dump files:
To pass a folder path to -i, ensure that the names of the files in the folder are in op_name.output_index.timestamp.npy format.
To pass a file path to -i, ensure that the file name is in op_name.output_index.timestamp.npy format. Only one file can be specified at a time.
op_name must comply with the A-Za-z0-9_- regular expression rule, timestamp is 16 digits long, and output_index consists of the digits 0–9.
- Converting dump files into NumPy files:
To pass a folder path to -i, ensure that the names of the files in the folder comply with the naming conventions described in Dump File Naming Conventions. To pass a file path to -i, ensure that the file name complies with those conventions.
- -o: (required) output path
The following is a command example for converting the .npy file into a dump file:
python3.7.5 dump_data_conversion.pyc -type tf -target dump -i /home/HwHiAiUser/tfnpyfile -o /home/HwHiAiUser/tfdump
- You can find the dump_data_conversion.pyc script in /home/HwHiAiUser/Ascend/ascend-toolkit/latest/toolkit/tools/operator_cmp/compare.
- To use this script for conversion, ensure that the host has at least 15 GB disk space. If the size of a single dump file to be converted exceeds 441 MB, you are advised to use a host with larger disk space.
How Do I Convert the Format of a Dump File?
Converting the Format of a Dump File
In the current version, dump files generated by running on the Ascend AI Processor can be converted into NumPy format.
To perform the conversion, execute the shape_conversion.pyc script stored in the /home/HwHiAiUser/Ascend/ascend-toolkit/latest/toolkit/tools/operator_cmp/compare path. The command syntax is as follows:
python3.7.5 shape_conversion.pyc -i DUMP_FILE_PATH -format FORMAT -o OUTPUT_PATH [-shape SHAPE] [-tensor TENSOR] [-index INDEX] [-custom CUSTOM_PATH]
The command-line options are described as follows:
- -i: dump file (including the path) of a model running on the Ascend AI Processor
- -format: format of the converted file
- -o: directory of the converted file
- -shape: (optional) shape to be set for FRACTAL_NZ conversion. The shape format is ([0-9]+,)+[0-9]+, where each number must be greater than 0.
- -tensor: (optional) indicates whether the dump file to be converted is the input or output. Defaults to output.
- -index: (optional) sequence number of a tensor, indexed starting at 0.
- -custom: (optional) customized path to store the .py file for format conversion, which should be the upper-level directory of the format_convert directory. For details about the .py file requirements, see Preparing a Customized .py File for Format Conversion.
The result is saved in the format of original_file_name.output.{index}.{shape}.npy or original_file_name.input.{index}.{shape}.npy, where shape is formatted as 1x3x224x224.
If the custom format is the same as the built-in format, the custom format applies.
Currently, the following built-in format conversion types are supported:
- FRACTAL_NZ to NCHW
- FRACTAL_NZ to ND
- HWCN to FRACTAL_Z
- HWCN to NCHW
- HWCN to NHWC
- NC1HWC0 to HWCN
- NC1HWC0 to NCHW
- NC1HWC0 to NHWC
- NCHW to FRACTAL_Z
- NCHW to NHWC
- NHWC to FRACTAL_Z
- NHWC to HWCN
- NHWC to NCHW
Preparing a Customized .py File for Format Conversion
Prepare the file as follows:
- The name of the .py file is in convert_{format_from}_to_{format_to}.py format. The supported values for format_from and format_to are as follows:
- NCHW
- NHWC
- ND
- NC1HWC0
- FRACTAL_Z
- NC1C0HWPAD
- NHWC1C0
- FSR_NCHW
- FRACTAL_DECONV
- C1HWNC0
- FRACTAL_DECONV_TRANSPOSE
- FRACTAL_DECONV_SP_STRIDE_TRANS
- NC1HWC0_C04
- FRACTAL_Z_C04
- CHWN
- DECONV_SP_STRIDE8_TRANS
- NC1KHKWHWC0
- BN_WEIGHT
- FILTER_HWCK
- HWCN
- LOOKUP_LOOKUPS
- LOOKUP_KEYS
- LOOKUP_VALUE
- LOOKUP_OUTPUT
- LOOKUP_HITS
- MD
- NDHWC
- C1HWNCoC0
- FRACTAL_NZ
- The content of the .py file is as follows:
def convert(shape_from, shape_to, array):
    return numpy_array
The parameters are described as follows:
shape_from: shape of the one-dimensional array before conversion
shape_to: (optional) shape of the one-dimensional array after conversion
array: one-dimensional source data
return: NumPy array after conversion
- The directory of the .py file must meet the following requirement:
The .py file must be stored in the format_convert directory. If the directory does not exist, create one.
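As an illustration, a convert_NCHW_to_NHWC.py file following the signature above could look like the sketch below. The file name and its behavior are an example written against the stated convention, not a file shipped with the toolkit:

```python
import numpy as np

# Example body of convert_NCHW_to_NHWC.py following the convert() signature above.
def convert(shape_from, shape_to, array):
    """Reshape the one-dimensional source data to shape_from (NCHW),
    transpose the axes to NHWC, and return the converted NumPy array."""
    n, c, h, w = shape_from
    return np.asarray(array).reshape(n, c, h, w).transpose(0, 2, 3, 1)
```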
How Do I View a Dump File?
Dump files cannot be viewed with a text tool. Therefore, you need to convert your dump file into a NumPy file and save the NumPy file as a text file using numpy.savetxt.
- Log in to the development environment as the installation user.
- Go to the /home/HwHiAiUser/Ascend/ascend-toolkit/latest/toolkit/tools/operator_cmp/compare directory.
- Run the dump_data_conversion.pyc script to convert the dump file into a NumPy file. The following is an example:
python3.7.5 dump_data_conversion.pyc -target numpy -type offline -i /home/HwHiAiUser/dump -o /home/HwHiAiUser/dumptonumpy
For details about the parameters of the dump_data_conversion.pyc script, see How Do I Convert an .npy File into a Dump File?.
- Use Python to save the NumPy data into a text file. The following is an example:
$ python3.7.5
Python 3.7.5 (default, Mar 5 2020, 16:07:54)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> a = np.load("/home/HwHiAiUser/dumptonumpy/Pooling.pool1.1147.1589195081588018.output.0.npy")
>>> b = a.flatten()
>>> np.savetxt("/home/HwHiAiUser/dumptonumpy/Pooling.pool1.1147.1589195081588018.output.0.txt", b)
The dimension information and dtype are not preserved in the .txt file. For details, visit the NumPy website.