Environment Preparation
Introduction
The Profiling tool analyzes the key performance bottlenecks of an application project running on the Ascend AI Processor phase by phase and provides suggestions to achieve optimal product performance.
Specifically, the tool collects, analyzes, aggregates, and displays the hardware and software profile data generated while an application project runs:
- Hardware profile data: the performance monitor unit (PMU) events of AI Core and system hardware data
- Software profile data of the AscendCL, GE, and RTS modules
Restrictions
Note the following when using the Profiling tool:
- Profiling can be enabled in three ways: by running the hiprof.pyc script from the command line, by passing the acl.json file in an application project, or by calling the AscendCL API. The order of priority is hiprof.pyc > acl.json > AscendCL API. To use the AscendCL API, delete all related configurations from the acl.json file or do not pass the acl.json file to the aclInit() call.
- Multiple profiling jobs must not share the same result directory. Otherwise, the profiling results may be inaccurate. For example, if the main program contains several independent inference jobs, errors occur when the main program is called through Profiling.
- Do not start two profiling jobs on the same device.
- Dump and Profiling must not be enabled at the same time. To enable Profiling, disable Dump. If both are enabled, Dump can degrade system performance, which makes the profile data collected by Profiling inaccurate.
- Profiling requires Python 3.7. Python 3.7.5 is recommended.
- Profiling is not supported in a Docker container.
- Application projects must be developed in compliance with the Application Software Development Guide. To obtain complete profile data, call aclInit() to initialize AscendCL and call aclFinalize() to deinitialize AscendCL.
If the application calls only aclInit(), the Profiling process does not end properly and the collected profile data is incomplete. Up to 2 MB of profile data from the last second may be lost due to synchronization failure. This loss does not affect analysis of the data that has already been synchronized.
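The pairing requirement for aclInit() and aclFinalize() can be sketched as a small Python context manager. This is only an illustration of the pattern, not part of the Profiling tool or of AscendCL: the name acl_session and the init and finalize parameters are hypothetical, and the callables passed in the demo are stand-ins for whatever binding of aclInit() and aclFinalize() the project actually uses.

```python
from contextlib import contextmanager


@contextmanager
def acl_session(init, finalize):
    """Call init() on entry and always call finalize() on exit.

    `init` and `finalize` are hypothetical stand-ins for the project's
    own wrappers around aclInit() and aclFinalize().
    """
    init()
    try:
        yield
    finally:
        # Runs even if the body raises, so deinitialization is not skipped.
        finalize()


# Demo with stand-in callables that only record the call order.
calls = []
with acl_session(lambda: calls.append("init"),
                 lambda: calls.append("finalize")):
    calls.append("inference")
```

Wrapping the application body this way means the deinitialization step still runs when the body exits early or raises, which is the case where incomplete profile data would otherwise be most likely.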
Preparations Before Using Profiling
Before using Profiling, set up the environment by referring to CANN Software Installation Guide.
In addition, log in to the host as the root user and configure the dynamic libraries on which the application project depends.
All examples in this document share the same prerequisite: the Driver, Firmware, and AI CPU are installed as the root user in the default installation path /usr/local/Ascend, and the other components are installed as the HwHiAiUser user in the default installation path /home/HwHiAiUser/Ascend. Replace these paths as required.
The operating environment in this document depends on the actual scenario: it is the host for Ascend EP and the board environment for Ascend RC.
The procedure is as follows:
- Log in to the operating environment as the root user.
- Open the /etc/ld.so.conf file and append the paths of the dynamic libraries on which the application project depends. The format is as follows:
- Ascend EP:
/home/HwHiAiUser/Ascend/nnrt/latest/acllib/lib64
/usr/local/Ascend/add-ons
/usr/local/Ascend/driver/lib64
- Ascend RC:
/home/HwHiAiUser/Ascend/acllib/lib64
/usr/lib64
- Run the ldconfig command for the settings to take effect.
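The steps above can be sketched as a short Python helper that appends only the missing library paths, so that repeated runs do not duplicate entries. This is a minimal sketch, not an official script: the function names missing_entries, append_lib_paths, and refresh_linker_cache are hypothetical, and the path list is the Ascend EP example from this section, which assumes the default installation paths.

```python
from pathlib import Path
import subprocess

# Ascend EP example paths from this document; adjust for your installation.
EP_LIB_PATHS = [
    "/home/HwHiAiUser/Ascend/nnrt/latest/acllib/lib64",
    "/usr/local/Ascend/add-ons",
    "/usr/local/Ascend/driver/lib64",
]


def missing_entries(conf_text, wanted):
    """Return the wanted paths not yet listed in conf_text, in order."""
    present = {line.strip() for line in conf_text.splitlines()}
    result = []
    for path in wanted:
        if path not in present and path not in result:
            result.append(path)
    return result


def append_lib_paths(conf_file, wanted):
    """Append missing paths to conf_file, one per line; return what was added."""
    conf = Path(conf_file)
    text = conf.read_text() if conf.exists() else ""
    to_add = missing_entries(text, wanted)
    if to_add:
        # Start on a fresh line if the file does not end with a newline.
        prefix = "" if text == "" or text.endswith("\n") else "\n"
        with conf.open("a") as handle:
            handle.write(prefix + "\n".join(to_add) + "\n")
    return to_add


def refresh_linker_cache():
    """Run ldconfig so the new entries take effect (requires root)."""
    subprocess.run(["ldconfig"], check=True)
```

As the root user, calling append_lib_paths("/etc/ld.so.conf", EP_LIB_PATHS) and then refresh_linker_cache() would correspond to the last two steps. For Ascend RC, pass the RC paths listed above instead.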