Model Loading and Execution
- Function: load_from_file
- Function: load_from_mem
- Function: load_from_file_with_mem
- Function: load_from_mem_with_mem
- Function: load_from_file_with_q
- Function: load_from_mem_with_q
- Function: execute
- Function: execute_async
- Function: unload
- Function: query_size
- Function: query_size_from_mem
- Function: set_dynamic_batch_size
- Function: set_dynamic_hw_size
Function: load_from_file

| Item | Description |
| --- | --- |
| C Prototype | `aclError aclmdlLoadFromFile(const char *modelPath, uint32_t *modelId)` |
| Python Function | `model_id, ret = acl.mdl.load_from_file(model_path)` |
| Function Usage | Loads offline model data (adapted to the Ascend AI processor) from a file. This is a synchronous interface. The model execution memory is managed by the system. Returns a model ID after the model is loaded; the model ID identifies the model in subsequent operations. |
| Input Description | `model_path`: str, path of the offline model file, including the file name. The user who runs the application must have permission to access the path. The offline model file is the *.om file adapted to the Ascend AI processor. |
| Return Value | `model_id`: int, model ID generated after the model is loaded.<br>`ret`: int, error code. |
| Restrictions | None |
| Precautions | None |
Function: load_from_mem

| Item | Description |
| --- | --- |
| C Prototype | `aclError aclmdlLoadFromMem(const void *model, size_t modelSize, uint32_t *modelId)` |
| Python Function | `model_id, ret = acl.mdl.load_from_mem(model, model_size)` |
| Function Usage | Loads offline model data from memory. This is a synchronous interface. The model execution memory is managed by the system. Returns a model ID after the model is loaded; the model ID identifies the model in subsequent operations. |
| Input Description | `model`: int, pointer object of the memory address of the model data.<br>`model_size`: int, length of the model data in the memory, in bytes. |
| Return Value | `model_id`: int, model ID generated after the model is loaded.<br>`ret`: int, error code. |
| Restrictions | None |
| Precautions | None |
Function: load_from_file_with_mem

| Item | Description |
| --- | --- |
| C Prototype | `aclError aclmdlLoadFromFileWithMem(const char *modelPath, uint32_t *modelId, void *workPtr, size_t workSize, void *weightPtr, size_t weightSize)` |
| Python Function | `model_id, ret = acl.mdl.load_from_file_with_mem(model_path, work_ptr, work_size, weight_ptr, weight_size)` |
| Function Usage | Loads offline model data (adapted to the Ascend AI processor) from a file. This is a synchronous interface. The user manages the memory for running the model. Returns a model ID after the model is loaded; the model ID identifies the model in subsequent operations. |
| Input Description | `model_path`: str, path of the offline model file, including the file name. The user who runs the application must have permission to access the path. The offline model file is the *.om file adapted to the Ascend AI processor.<br>`work_ptr`: int, address of the working memory (for storing model input and output data) required by the model on the device. The address is managed by the user.<br>`work_size`: int, size of the working memory required by the model, in bytes.<br>`weight_ptr`: int, pointer of the model weight memory (for storing weight data) on the device. The pointer is managed by the user.<br>`weight_size`: int, size of the weight memory required by the model, in bytes. |
| Return Value | `model_id`: int, model ID generated after the model is loaded.<br>`ret`: int, error code. |
| Restrictions | None |
| Precautions | None |
Function: load_from_mem_with_mem

| Item | Description |
| --- | --- |
| C Prototype | `aclError aclmdlLoadFromMemWithMem(const void *model, size_t modelSize, uint32_t *modelId, void *workPtr, size_t workSize, void *weightPtr, size_t weightSize)` |
| Python Function | `model_id, ret = acl.mdl.load_from_mem_with_mem(model, model_size, work_ptr, work_size, weight_ptr, weight_size)` |
| Function Usage | Loads offline model data from memory. This is a synchronous interface. The user manages the model execution memory. Returns a model ID after the model is loaded; the model ID identifies the model in subsequent operations. |
| Input Description | `model`: int, memory address of the model data.<br>`model_size`: int, model data length, in bytes.<br>`work_ptr`: int, address of the working memory (for storing model input and output data) required by the model on the device. The address is managed by the user.<br>`work_size`: int, size of the working memory required by the model, in bytes.<br>`weight_ptr`: int, pointer of the model weight memory (for storing weight data) on the device. The pointer is managed by the user.<br>`weight_size`: int, size of the weight memory required by the model, in bytes. |
| Return Value | `model_id`: int, model ID generated after the model is loaded.<br>`ret`: int, error code. |
| Restrictions | None |
| Precautions | None |
Function: load_from_file_with_q

| Item | Description |
| --- | --- |
| C Prototype | `aclError aclmdlLoadFromFileWithQ(const char *modelPath, uint32_t *modelId, const uint32_t *inputQ, size_t inputQNum, const uint32_t *outputQ, size_t outputQNum)` |
| Python Function | `model_id, ret = acl.mdl.load_from_file_with_q(model_path, input_q, input_q_num, output_q, output_q_num)` |
| Function Usage | Loads offline model data (adapted to the Ascend AI processor) from a file. The model input and output data are stored in queues. This is a synchronous interface. The current version does not support this interface. Returns a model ID after the model is loaded; the model ID identifies the model in subsequent operations. |
| Input Description | `model_path`: str, path of the offline model file, including the file name. The user who runs the application must have permission to access the path. The offline model file is the *.om file adapted to the Ascend AI processor.<br>`input_q`: int, queue ID; each model input corresponds to a queue ID.<br>`input_q_num`: int, number of input queues.<br>`output_q`: int, queue ID; each model output corresponds to a queue ID.<br>`output_q_num`: int, number of output queues. |
| Return Value | `model_id`: int, model ID generated after the model is loaded.<br>`ret`: int, error code. |
| Restrictions | None |
| Precautions | None |
Function: load_from_mem_with_q

| Item | Description |
| --- | --- |
| C Prototype | `aclError aclmdlLoadFromMemWithQ(const void *model, size_t modelSize, uint32_t *modelId, const uint32_t *inputQ, size_t inputQNum, const uint32_t *outputQ, size_t outputQNum)` |
| Python Function | `model_id, ret = acl.mdl.load_from_mem_with_q(model, model_size, input_q, input_q_num, output_q, output_q_num)` |
| Function Usage | Loads offline model data (adapted to the Ascend AI processor) from memory. The model input and output data are stored in queues. This is a synchronous interface. The current version does not support this interface. Returns a model ID after the model is loaded; the model ID identifies the model in subsequent operations. |
| Input Description | `model`: int, memory address of the model data.<br>`model_size`: int, length of the model data in the memory, in bytes.<br>`input_q`: int, queue ID; each model input corresponds to a queue ID.<br>`input_q_num`: int, number of input queues.<br>`output_q`: int, queue ID; each model output corresponds to a queue ID.<br>`output_q_num`: int, number of output queues. |
| Return Value | `model_id`: int, model ID generated after the model is loaded.<br>`ret`: int, error code. |
| Restrictions | None |
| Precautions | None |
Function: execute

| Item | Description |
| --- | --- |
| C Prototype | `aclError aclmdlExecute(uint32_t modelId, const aclmdlDataset *input, aclmdlDataset *output)` |
| Python Function | `ret = acl.mdl.execute(model_id, input, output_)` |
| Function Usage | Executes model inference and blocks until the result is returned. This is a synchronous interface. |
| Input Description | `model_id`: int, ID of the model to be inferred.<br>`input`: int, pointer object of the input data of model inference.<br>`output_`: int, pointer object of the output data of model inference. |
| Return Value | `ret`: int, error code. |
| Restrictions | Resources (such as streams and memory) associated with a single model_id are unique. Therefore, concurrent use of the same model_id across threads is not allowed; otherwise, the service becomes unavailable. |
| Precautions | The input and output pointer objects must be created in advance. |
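A sketch of preparing the input and output pointer objects and running a synchronous inference. The buffer sizes below are placeholders (e.g. one 1x3x224x224 float32 input and one 1x1000 float32 output) and must match the loaded model; the policy constant value is an assumption:

```python
import acl

ACL_SUCCESS = 0
ACL_MEM_MALLOC_HUGE_FIRST = 0  # assumed policy constant value

def make_dataset(sizes):
    # Build an aclmdlDataset whose buffers are freshly allocated device memory.
    dataset = acl.mdl.create_dataset()
    for size in sizes:
        dev_ptr, ret = acl.rt.malloc(size, ACL_MEM_MALLOC_HUGE_FIRST)
        assert ret == ACL_SUCCESS
        buf = acl.create_data_buffer(dev_ptr, size)
        _, ret = acl.mdl.add_dataset_buffer(dataset, buf)
        assert ret == ACL_SUCCESS
    return dataset

def run_inference(model_id):
    input_ds = make_dataset([1 * 3 * 224 * 224 * 4])  # placeholder input size
    output_ds = make_dataset([1 * 1000 * 4])          # placeholder output size
    # Blocks until inference completes; the result lands in output_ds buffers.
    ret = acl.mdl.execute(model_id, input_ds, output_ds)
    assert ret == ACL_SUCCESS
    return output_ds
```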
Function: execute_async

| Item | Description |
| --- | --- |
| C Prototype | `aclError aclmdlExecuteAsync(uint32_t modelId, const aclmdlDataset *input, aclmdlDataset *output, aclrtStream stream)` |
| Python Function | `ret = acl.mdl.execute_async(model_id, input, output_, stream)` |
| Function Usage | Executes model inference. This is an asynchronous interface. |
| Input Description | `model_id`: int, ID of the model to be inferred.<br>`input`: int, pointer object of the input data of model inference.<br>`output_`: int, pointer object of the output data of model inference.<br>`stream`: int, pointer object of the stream. |
| Return Value | `ret`: int, error code. |
| Restrictions | None |
| Precautions | The stream, input, and output pointer objects must be created in advance. |
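The asynchronous variant queues the same work on a stream and must be synchronized before the output is read. A sketch:

```python
import acl

ACL_SUCCESS = 0

def infer_async(model_id, input_ds, output_ds):
    # The datasets must already exist; the stream orders the async tasks.
    stream, ret = acl.rt.create_stream()
    assert ret == ACL_SUCCESS

    # Returns immediately after the task is queued on the stream.
    ret = acl.mdl.execute_async(model_id, input_ds, output_ds, stream)
    assert ret == ACL_SUCCESS

    # Block until all tasks on the stream finish before reading output_ds.
    ret = acl.rt.synchronize_stream(stream)
    assert ret == ACL_SUCCESS

    ret = acl.rt.destroy_stream(stream)
    assert ret == ACL_SUCCESS
```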
Function: unload

| Item | Description |
| --- | --- |
| C Prototype | `aclError aclmdlUnload(uint32_t modelId)` |
| Python Function | `ret = acl.mdl.unload(model_id)` |
| Function Usage | Unloads the model and releases its resources after model inference is complete. This is a synchronous interface. |
| Input Description | `model_id`: int, ID of the model to be unloaded. |
| Return Value | `ret`: int, error code. |
| Restrictions | When calling the acl.mdl.unload API to unload a model, ensure that the model is not being used by other APIs. |
| Precautions | None |
Function: query_size

| Item | Description |
| --- | --- |
| C Prototype | `aclError aclmdlQuerySize(const char *fileName, size_t *workSize, size_t *weightSize)` |
| Python Function | `work_size, weight_size, ret = acl.mdl.query_size(file_name)` |
| Function Usage | Obtains the working memory size and weight memory size required for model execution, based on the model file. This is a synchronous interface. |
| Input Description | `file_name`: str, path of the model file whose memory information is to be obtained, including the file name. The user who runs the application must have permission to access the path. |
| Return Value | `work_size`: int, size of the working memory required for model execution, in bytes.<br>`weight_size`: int, size of the weight memory required for model execution, in bytes.<br>`ret`: int, error code. |
| Restrictions | None |
| Precautions | None |
Function: query_size_from_mem

| Item | Description |
| --- | --- |
| C Prototype | `aclError aclmdlQuerySizeFromMem(const void *model, size_t modelSize, size_t *workSize, size_t *weightSize)` |
| Python Function | `work_size, weight_size, ret = acl.mdl.query_size_from_mem(model, model_size)` |
| Function Usage | Obtains the working memory size and weight memory size required for model execution, based on the model data in memory. This is a synchronous interface. |
| Input Description | `model`: int, pointer object of the model data whose memory information needs to be obtained.<br>`model_size`: int, model data length, in bytes. |
| Return Value | `work_size`: int, size of the working memory required for model execution, in bytes.<br>`weight_size`: int, size of the weight memory required for model execution, in bytes.<br>`ret`: int, error code. |
| Restrictions | The working and weight memory is on the device and must be allocated and released by the user. |
| Precautions | None |
Function: set_dynamic_batch_size

| Item | Description |
| --- | --- |
| C Prototype | `aclError aclmdlSetDynamicBatchSize(uint32_t modelId, aclmdlDataset *dataset, size_t index, uint64_t batchSize)` |
| Python Function | `ret = acl.mdl.set_dynamic_batch_size(model_id, dataset, index, batch_size)` |
| Function Usage | Sets the dynamic batch size (number of images processed in each batch) for model inference in dynamic batching scenarios. The batch size must be one of the choices specified by the dynamic_batch_size parameter during model conversion. For details about model conversion, see ATC Tool Instructions. You can also call acl.mdl.get_dynamic_batch to obtain the number of batch size choices and the batch size of each choice supported by the model. |
| Input Description | `model_id`: int, model ID.<br>`dataset`: int, pointer object of the input data of a model.<br>`index`: int, index of the dynamic-batch input, obtained by calling acl.mdl.get_input_index_by_name. For dynamic batch and image size, the input name is fixed to ascend_mbatch_shape_data. For dynamic AIPP, the input name is fixed to ascend_dynamic_aipp_data.<br>`batch_size`: int, batch size for model inference. |
| Return Value | `ret`: int, error code. |
| Restrictions | None |
| Precautions | Call acl.mdl.get_input_index_by_name to obtain the index. For dynamic batch and image size, the input name is fixed to ascend_mbatch_shape_data. For dynamic AIPP, the input name is fixed to ascend_dynamic_aipp_data. |
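A sketch of looking up the index by the fixed input name and selecting a batch size (the batch size passed in must be one of the choices baked in at conversion time):

```python
import acl

ACL_SUCCESS = 0

def select_batch_size(model_id, dataset, batch_size):
    # The index of the dynamic-batch input is found via its fixed name.
    model_desc = acl.mdl.create_desc()
    ret = acl.mdl.get_desc(model_desc, model_id)
    assert ret == ACL_SUCCESS

    index, ret = acl.mdl.get_input_index_by_name(
        model_desc, "ascend_mbatch_shape_data")
    assert ret == ACL_SUCCESS

    ret = acl.mdl.set_dynamic_batch_size(model_id, dataset, index, batch_size)
    assert ret == ACL_SUCCESS
    acl.mdl.destroy_desc(model_desc)
```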
Function: set_dynamic_hw_size

| Item | Description |
| --- | --- |
| C Prototype | `aclError aclmdlSetDynamicHWSize(uint32_t modelId, aclmdlDataset *dataset, size_t index, uint64_t height, uint64_t width)` |
| Python Function | `ret = acl.mdl.set_dynamic_hw_size(model_id, dataset, index, height, width)` |
| Function Usage | Sets the height and width of the input image for model inference. The image size must be one of the choices specified by the dynamic_image_size parameter during model conversion. For details about model conversion, see ATC Tool Instructions. You can also call acl.mdl.get_dynamic_hw to obtain the number of image size choices and the image size of each choice supported by the model. |
| Input Description | `model_id`: int, model ID.<br>`dataset`: int, pointer object of the input data of a model.<br>`index`: int, index of the dynamic-resolution input, obtained by calling acl.mdl.get_input_index_by_name. For dynamic batch and image size, the input name is fixed to ascend_mbatch_shape_data. For dynamic AIPP, the input name is fixed to ascend_dynamic_aipp_data.<br>`height`: int, height to be set.<br>`width`: int, width to be set. |
| Return Value | `ret`: int, error code. |
| Restrictions | None |
| Precautions | Call acl.mdl.get_input_index_by_name to obtain the index. For dynamic batch and image size, the input name is fixed to ascend_mbatch_shape_data. For dynamic AIPP, the input name is fixed to ascend_dynamic_aipp_data. |
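Selecting a resolution follows the same pattern as selecting a batch size (height and width must match one of the dynamic_image_size choices from conversion time):

```python
import acl

ACL_SUCCESS = 0

def select_resolution(model_id, dataset, height, width):
    model_desc = acl.mdl.create_desc()
    ret = acl.mdl.get_desc(model_desc, model_id)
    assert ret == ACL_SUCCESS

    # Same fixed input name as in the dynamic-batch case.
    index, ret = acl.mdl.get_input_index_by_name(
        model_desc, "ascend_mbatch_shape_data")
    assert ret == ACL_SUCCESS

    ret = acl.mdl.set_dynamic_hw_size(model_id, dataset, index, height, width)
    assert ret == ACL_SUCCESS
    acl.mdl.destroy_desc(model_desc)
```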