BatchNormTrainingBackprop
BatchNormTrainingBackprop#
Versioned name: BatchNormTrainingBackprop-1
Category: Normalization
Short description: BatchNormTrainingBackprop computes gradient for batch normalization.
Attributes:
epsilon
Description: epsilon is the number to be added to the variance to avoid division by zero when normalizing a value. For example, epsilon equal to 0.001 means that 0.001 is added to the variance.
Range of values: arbitrary positive f32 value
Type: f32
Required: yes
data_format
Description: data_format denotes the data format of the input, output_delta and input_delta.
Range of values: NXC or NCX (X means HW for 2D, DHW for 3D)
Type: string
Default value: NXC
Required: no
Inputs
1:
input_forward
- original input tensor of BatchNormForwardTraining op. Required.Type: T1
2:
output_delta
- the gradient with respect to output. Required.Type: T1
3:
mean
- batch mean. A 1D tensor with the same span as input’s channel axis. Required.Type: T2
4:
variance
- batch variance. A 1D tensor with the same span as input’s channel axis. Required.Type: T2
5:
gamma
- gamma scaling for normalized value. A 1D tensor with the same span as input’s channel axis. Optional.Type: T2
Outputs
1:
input_delta
- the the gradient tensor with respect to the output of the batch normalization.Type: T1
2:
gamma_delta
- the the gradient tensor with respect to the gamma of the batch normalization. Optional.Type: T2
3:
beta_delta
- the the gradient tensor with respect to the beta of the batch normalization. Optional.Type: T2
Types
T1: f32, f16, bf16.
T2: f32, bf16.
Constraints: T2 can be bf16 only when T1 is bf16.