Scheduling and Building
After defining the computation logic, implement scheduling and building in the operator API implementation function, as shown in the following code.
Call the auto_schedule API to generate the schedule automatically. You can view the IR of the computation through the TVM printing mechanism. The configuration information includes the IR print switch, the build switch, the operator name in the kernel, and the input and output tensors.
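    with tvm.target.cce():
        # Generate the schedule automatically from the output tensor.
        schedule = generic.auto_schedule(result)

    config = {
        "print_ir": True,
        "need_build": True,
        "name": kernel_name,
        "tensor_list": [input_data, result],
        "bool_storage_as_1bit": True
    }
    # Build the operator from the schedule and the configuration.
    te.lang.cce.cce_build_code(schedule, config)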
- Use the auto_schedule interface of generic to generate the schedule automatically. The argument of the auto_schedule interface is the output tensor of the operator.
The schedule object defines how to execute the described computation efficiently on hardware, that is, how the computations are mapped to instructions on the hardware device. A schedule object contains an IR, which describes the computation process in a pseudocode-like form. To print the IR, set the print_ir parameter in config to True.
- need_build: whether build is performed. The default value is True.
- name: name of the operator binary file generated after build. The name can contain a maximum of 200 characters. Only uppercase letters, lowercase letters, digits, and underscores (_) are allowed, and the name must start with a letter or an underscore (_).
- tensor_list: a list of the input and output tensors. The input and output tensors must be arranged in the input and output sequences of the operator.
Note: The input tensor must be the tensor object returned by the placeholder API. The memory address of the tensor object cannot be overwritten.
For example: "tensor_list": [tensor_a, tensor_b, res], where tensor_a and tensor_b are the input tensors and res is the output tensor.
- bool_storage_as_1bit: whether to store bools in 1-bit format. Value True indicates 1-bit storage; value False indicates 8-bit storage. The default value is True.
When the mode argument passed to the te.lang.cce.vcmp(lhs, rhs, op, mode) call is bool, set this parameter to False.
- The cce_build_code API provided by te.lang.cce builds the operator based on the schedule and the configuration. When OMG converts the model, the operator is built into a dedicated kernel based on the input data shape, data type, and operator parameters.
- schedule: schedule object that describes the computation of the operator to be generated.
- config: map of the build configuration parameters described above.
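The configuration rules above can be checked before the build is started. The following is a minimal sketch in plain Python that does not call TE. The helper name make_build_config and its vcmp_bool_mode flag are hypothetical and exist only for this illustration, and the print_ir default used here is the helper's own choice, not a documented library default.

```python
import re

# Naming rule from the description of "name": letters, digits, and underscores
# only, starting with a letter or underscore, at most 200 characters in total.
_KERNEL_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,199}")


def make_build_config(kernel_name, inputs, outputs, vcmp_bool_mode=False,
                      print_ir=False, need_build=True):
    """Assemble a config dict using the keys documented in this section.

    Hypothetical helper, not part of TE. Set vcmp_bool_mode to True when
    te.lang.cce.vcmp is called with mode set to bool.
    """
    if not _KERNEL_NAME_RE.fullmatch(kernel_name):
        raise ValueError("invalid kernel name: %r" % (kernel_name,))
    return {
        "print_ir": print_ir,
        "need_build": need_build,
        "name": kernel_name,
        # Inputs first, then outputs, in the order the operator defines them.
        "tensor_list": list(inputs) + list(outputs),
        # 1-bit storage by default; 8-bit storage when vcmp runs in bool mode.
        "bool_storage_as_1bit": not vcmp_bool_mode,
    }
```

In an operator implementation, the returned dict would be passed as the config argument, for example make_build_config(kernel_name, [input_data], [result]).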
After the build is complete, the operator binary file (.o format) and operator description file (.json format) are generated.
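A quick way to confirm that a build produced both files is to look for them on disk. The sketch below assumes that the .json file shares its base name with the .o file (this section states only that name is the name of the binary file) and that the caller knows the output directory, which this section does not specify. The helper names are hypothetical.

```python
import os


def expected_artifacts(kernel_name, out_dir):
    """Return the paths of the .o and .json files expected after a build.

    Assumption: both files are named after kernel_name and are written
    to out_dir.
    """
    return [os.path.join(out_dir, kernel_name + ext) for ext in (".o", ".json")]


def missing_artifacts(kernel_name, out_dir):
    """Return the expected artifact paths that do not exist on disk."""
    return [path for path in expected_artifacts(kernel_name, out_dir)
            if not os.path.isfile(path)]
```

An empty list returned by missing_artifacts means that both files were found.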