Mixed Precision
Overview
Mixed precision is the combined use of the float16 and float32 data types when training deep neural networks, which reduces memory usage and memory-access frequency. Mixed precision training makes it easier to train and deploy larger networks without sacrificing the accuracy achieved with pure float32. Currently, the Ascend AI Processor supports the following training precision modes; choose one as needed in the training script.
- allow_fp32_to_fp16: The original precision is preferentially retained. If an operator does not support the float32 data type, the float16 precision is used. Currently, the float32 type is not supported by convolution operators, such as Conv2D and DepthwiseConv2D. These operators are precision-insensitive and do not reduce the accuracy of the entire network. This is the default precision mode.
- force_fp16: If an operator supports both float16 and float32 data types, float16 is forcibly selected.
- must_keep_origin_dtype: The original precision is retained. In this mode, if the network contains a Conv2D operator and the input type in the original graph is float32, training is interrupted because the operator supports only float16.
- allow_mix_precision: Mixed precision is allowed. Based on the built-in optimization policy, the precision of some float32 operators on the network is automatically reduced to float16, which improves system performance and reduces memory usage with little accuracy loss. Note that the Ascend AI Processor supports only float32-to-float16 casting. It is recommended that Loss Scaling be enabled together with mixed precision to compensate for the accuracy loss caused by the reduced precision (see the sketch after this list). If the original script already implements manual mixed precision, for example, by explicitly calling the cast operator to convert the compute precision, mixed precision of the Ascend AI Processor does not need to be enabled.
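Loss Scaling multiplies the loss by a scaling factor before backpropagation so that small float16 gradients do not underflow, then divides the gradients by the same factor before the weight update. Below is a minimal sketch of static loss scaling in TensorFlow 1.x; the toy model, the loss_scale value, and the optimizer choice are illustrative assumptions, and the NPU software stack also provides dedicated loss scale optimizers that can be used instead.

import tensorflow as tf

# Toy model: the placeholders, variable, and loss below are illustrative assumptions.
x = tf.placeholder(tf.float32, [None, 4])
y = tf.placeholder(tf.float32, [None, 1])
w = tf.get_variable("w", [4, 1], dtype=tf.float32)
loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - y))

loss_scale = 1024.0  # static scaling factor (assumption); dynamic managers are also common
optimizer = tf.train.MomentumOptimizer(learning_rate=0.01, momentum=0.9)

# Scale the loss so that small float16 gradients do not underflow during backpropagation.
grads_and_vars = optimizer.compute_gradients(loss * loss_scale)

# Unscale the gradients before the weight update so the effective step size is unchanged.
unscaled = [(g / loss_scale if g is not None else None, v)
            for g, v in grads_and_vars]
train_op = optimizer.apply_gradients(unscaled)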
Setting Precision Mode with Estimator
from npu_bridge.estimator.npu.npu_config import NPURunConfig
from npu_bridge.estimator import npu_ops

npu_config = NPURunConfig(
    model_dir=FLAGS.model_dir,
    save_checkpoints_steps=FLAGS.save_checkpoints_steps,
    session_config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=False),
    precision_mode="allow_mix_precision"
)
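As a usage sketch, the NPURunConfig above is typically passed to an NPU Estimator. The import path below is an assumption based on the npu_bridge package layout, and model_fn, input_fn, and the FLAGS values are placeholders for your own training code.

from npu_bridge.estimator.npu.npu_estimator import NPUEstimator  # assumed import path

# model_fn and input_fn are placeholders for your own Estimator functions.
estimator = NPUEstimator(
    model_fn=model_fn,
    config=npu_config,
    model_dir=FLAGS.model_dir
)
estimator.train(input_fn=input_fn, max_steps=FLAGS.max_steps)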
To customize which operators allow precision reduction in the allow_mix_precision mode, modify the operator information library as follows:
- Go to the /opp/op_impl/built-in/ai_core/tbe/config/ascend910 directory in the OPP installation path.
- Grant the write permission on the aic-ascend910-ops-info.json file.
chmod u+w aic-ascend910-ops-info.json
- Modify or add the precision_reduce field of the corresponding operator in the aic-ascend910-ops-info.json operator information library file (a scripted example follows the value descriptions below).
"precision_reduce":{ "flag":"true" }
- true: allows precision reduction to float16 for operators supporting float32.
- false: forbids precision reduction to float16 for operators supporting float32.
- If not specified, the current operator uses the same mixed precision processing as the upstream operator.
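When several operators need to be adjusted, the field can also be patched programmatically. The following is a minimal sketch; the installation path, the operator name, and the assumption that the file maps operator names to attribute dictionaries are all illustrative and should be verified against the installed file.

import json

# Illustrative path and operator name (assumptions); adjust to your OPP installation.
ops_info_path = "/usr/local/Ascend/opp/op_impl/built-in/ai_core/tbe/config/ascend910/aic-ascend910-ops-info.json"
op_name = "Softmax"

with open(ops_info_path, "r") as f:
    ops_info = json.load(f)

# "false" forbids reducing this operator from float32 to float16.
ops_info.setdefault(op_name, {})["precision_reduce"] = {"flag": "false"}

with open(ops_info_path, "w") as f:
    json.dump(ops_info, f, indent=4)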
Setting Precision Mode with sess.run()
import tensorflow as tf
from npu_bridge.estimator import npu_ops
from tensorflow.core.protobuf.rewriter_config_pb2 import RewriterConfig

config = tf.ConfigProto()
custom_op = config.graph_options.rewrite_options.custom_optimizers.add()
custom_op.name = "NpuOptimizer"
custom_op.parameter_map["use_off_line"].b = True
custom_op.parameter_map["precision_mode"].s = tf.compat.as_bytes("allow_mix_precision")
config.graph_options.rewrite_options.remapping = RewriterConfig.OFF  # Disable remapping.
with tf.Session(config=config) as sess:
    print(sess.run(cost))
The parameter description and configuration method are the same as those in the Estimator method.