CANN
Huawei Compute Architecture for Neural Networks (CANN) is a heterogeneous computing architecture for AI scenarios. It provides multi-layer programming interfaces that help users quickly build AI applications and services based on the Ascend platform. Figure 5-5 shows the CANN architecture.
| Module | Function Description |
| --- | --- |
| AscendCL | Ascend Computing Language (AscendCL) provides a collection of C++/Python APIs for developing deep neural network applications such as object recognition and image classification, covering device management, context management, stream management, memory management, model loading and execution, operator loading and execution, and media data processing. Users can call the AscendCL interface through a third-party framework to use the computing capability of the inference card or AI accelerator module, or encapsulate AscendCL in third-party libraries to provide the running management and resource management capabilities of the inference card or AI accelerator module. When an application runs, AscendCL calls APIs provided by the Graph Engine (GE) executor to load and run models and operators, and calls Runtime APIs to manage devices, contexts, streams, and memory (see the lifecycle sketch after this table). |
| GE executor | Graph Engine (GE): the control center for graph compilation and running. It provides running environment management, execution engine management, operator library management, subgraph optimization management, graph operation management, and graph execution control. |
| Runtime | The runtime manager provides a resource management channel for task allocation of the neural network. The Ascend AI Processor runtime runs in the process space of the application and provides memory management, device management, stream management, event management, and kernel function execution for the application. |
| Operator/Acceleration library | The operator library covers the operators used in network models, including Caffe and TensorFlow operators. An independent acceleration library is also provided and exposed through AscendCL APIs, for example, the matrix multiplication API. |
| Driver | Drivers compatible with different hardware and OSs, including PCIe drivers and the drivers required for communication with computing resources. |
| Auxiliary development tools | Auxiliary tools for model or service development, including model conversion, model quantization, operator development, and performance debugging. |
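The device, context, stream, and memory management that AscendCL and the runtime provide follow a fixed lifecycle: initialize, acquire resources, execute, then release in reverse order. The following is a minimal C++ sketch of that lifecycle, assuming the standard `acl/acl.h` header from the CANN toolkit; the device ID and buffer size are illustrative placeholders, and `ACL_SUCCESS` is the return code name in recent releases (older releases use `ACL_ERROR_NONE`).

```cpp
// Minimal AscendCL resource lifecycle: init -> device -> context -> stream
// -> memory -> teardown. Error handling is reduced to a macro for brevity.
#include "acl/acl.h"
#include <cstdio>

#define CHECK_ACL(call)                                        \
    do {                                                       \
        aclError ret = (call);                                 \
        if (ret != ACL_SUCCESS) {                              \
            fprintf(stderr, "ACL error %d at %s:%d\n",         \
                    ret, __FILE__, __LINE__);                  \
            return -1;                                         \
        }                                                      \
    } while (0)

int main() {
    // 1. Initialize AscendCL (nullptr: no configuration file).
    CHECK_ACL(aclInit(nullptr));

    // 2. Bind this process to a device, then create a context and a stream.
    int32_t deviceId = 0;  // illustrative device ID
    aclrtContext context = nullptr;
    aclrtStream stream = nullptr;
    CHECK_ACL(aclrtSetDevice(deviceId));
    CHECK_ACL(aclrtCreateContext(&context, deviceId));
    CHECK_ACL(aclrtCreateStream(&stream));

    // 3. Allocate device memory (runtime memory management).
    void *devBuf = nullptr;
    size_t bufSize = 1024 * 1024;  // illustrative 1 MB buffer
    CHECK_ACL(aclrtMalloc(&devBuf, bufSize, ACL_MEM_MALLOC_HUGE_FIRST));

    // ... enqueue model or operator execution on the stream here ...
    CHECK_ACL(aclrtSynchronizeStream(stream));

    // 4. Release resources in reverse order of creation.
    CHECK_ACL(aclrtFree(devBuf));
    CHECK_ACL(aclrtDestroyStream(stream));
    CHECK_ACL(aclrtDestroyContext(context));
    CHECK_ACL(aclrtResetDevice(deviceId));
    CHECK_ACL(aclFinalize());
    return 0;
}
```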
Figure 5-6 shows the relationships between these modules and components. From top to bottom, the figure shows the AscendCL interface used for application implementation, the GE executor, the runtime manager, and the driver. Model loading supports TensorFlow and Caffe inference models. The digital vision pre-processing (DVPP) module encodes, decodes, scales, and crops images and videos. As the software foundation for the hardware computing capability of inference cards and AI accelerator modules, CANN performs the matrix computation of neural networks, general computation on control operators, scalars, and vectors, execution control, and image and video data preprocessing, thereby guaranteeing the execution of deep neural network (NN) computing. For details about the functions of other modules, see Table 5-1.
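To make the model-loading path concrete, the following is a hedged C++ sketch of running one inference on an offline model (an .om file, typically produced from a Caffe or TensorFlow model by the model conversion tool) through AscendCL. It assumes the runtime resources from the previous sketch are already created; the model path `resnet50.om` and the single input/output layout are illustrative assumptions.

```cpp
// Sketch: load an offline model and run one inference with AscendCL.
// Assumes aclInit/device/context/stream setup has already been done.
#include "acl/acl.h"

int runModel(void *devInput, size_t inputSize,
             void *devOutput, size_t outputSize) {
    // 1. Load the offline model from disk (path is illustrative).
    uint32_t modelId = 0;
    if (aclmdlLoadFromFile("resnet50.om", &modelId) != ACL_SUCCESS)
        return -1;

    // 2. Wrap pre-allocated device buffers into input/output datasets.
    aclmdlDataset *input = aclmdlCreateDataset();
    aclmdlDataset *output = aclmdlCreateDataset();
    aclDataBuffer *inBuf = aclCreateDataBuffer(devInput, inputSize);
    aclDataBuffer *outBuf = aclCreateDataBuffer(devOutput, outputSize);
    aclmdlAddDatasetBuffer(input, inBuf);
    aclmdlAddDatasetBuffer(output, outBuf);

    // 3. Execute synchronously; GE and the runtime schedule the
    //    compiled graph on the Ascend AI Processor.
    aclError ret = aclmdlExecute(modelId, input, output);

    // 4. Release model resources (device buffers stay owned by the caller).
    aclDestroyDataBuffer(inBuf);
    aclDestroyDataBuffer(outBuf);
    aclmdlDestroyDataset(input);
    aclmdlDestroyDataset(output);
    aclmdlUnload(modelId);
    return (ret == ACL_SUCCESS) ? 0 : -1;
}
```

Rather than hard-coding buffer sizes, a real application would query them from the model description (aclmdlGetDesc with aclmdlGetInputSizeByIndex and aclmdlGetOutputSizeByIndex) after loading the model.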