batch_norm
paddle.static.nn.batch_norm(input, act=None, is_test=False, momentum=0.9, epsilon=1e-05, param_attr=None, bias_attr=None, data_layout='NCHW', in_place=False, name=None, moving_mean_name=None, moving_variance_name=None, do_model_average_for_mean_and_var=True, use_global_stats=False)
Batch Normalization Layer
Can be used as a normalizer function for convolution or fully_connected operations. The required data format for this layer is one of the following:
NHWC [batch, in_height, in_width, in_channels]
NCHW [batch, in_channels, in_height, in_width]
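To make the two layouts concrete, the sketch below groups the values of a small 4-D nested list by channel. For 4-D input, batch normalization conventionally computes one mean and one variance per channel, over the batch and spatial positions. The helper values_per_channel is hypothetical and works on plain Python lists rather than Tensors; it is an illustration, not part of the Paddle API.

```python
def values_per_channel(data, data_layout="NCHW"):
    """Collect the values of a 4-D nested list by channel (illustrative only)."""
    if data_layout == "NCHW":
        groups = [[] for _ in data[0]]                 # one group per channel
        for sample in data:                            # batch
            for c, plane in enumerate(sample):         # channels
                for row in plane:                      # height
                    groups[c].extend(row)              # width
    elif data_layout == "NHWC":
        groups = [[] for _ in data[0][0][0]]           # one group per channel
        for sample in data:                            # batch
            for row in sample:                         # height
                for pixel in row:                      # width
                    for c, value in enumerate(pixel):  # channels
                        groups[c].append(value)
    else:
        raise ValueError("data_layout must be 'NCHW' or 'NHWC'")
    return groups


# One sample, two channels, 1x2 spatial size, stored in both layouts.
nchw = [[[[1.0, 2.0]], [[10.0, 20.0]]]]   # [batch, in_channels, in_height, in_width]
nhwc = [[[[1.0, 10.0], [2.0, 20.0]]]]     # [batch, in_height, in_width, in_channels]

groups_nchw = values_per_channel(nchw, "NCHW")
groups_nhwc = values_per_channel(nhwc, "NHWC")
```

Both calls yield the same per-channel groups, which is why the choice of data_layout changes only how the data is stored, not which values are normalized together.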
Refer to Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift for more details.
input is the set of input features over a mini-batch. In the formulas below, \(x_i\) is the i-th of its \(m\) values.
\[
\begin{aligned}
\mu_{\beta} &\gets \frac{1}{m} \sum_{i=1}^{m} x_i && \text{// mini-batch mean} \\
\sigma_{\beta}^{2} &\gets \frac{1}{m} \sum_{i=1}^{m} (x_i - \mu_{\beta})^2 && \text{// mini-batch variance} \\
\hat{x_i} &\gets \frac{x_i - \mu_{\beta}}{\sqrt{\sigma_{\beta}^{2} + \epsilon}} && \text{// normalize} \\
y_i &\gets \gamma \hat{x_i} + \beta && \text{// scale and shift}
\end{aligned}
\]
\[
\begin{aligned}
moving\_mean &= moving\_mean * momentum + mini\text{-}batch\_mean * (1. - momentum) \\
moving\_var &= moving\_var * momentum + mini\text{-}batch\_var * (1. - momentum)
\end{aligned}
\]
moving_mean is the global mean and moving_var is the global variance.
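As a rough illustration of the formulas above, the following pure-Python sketch normalizes the values of a single channel and updates the moving statistics. The function name batch_norm_train and its argument names are hypothetical, chosen for this example only; this is not how Paddle implements the operator.

```python
import math


def batch_norm_train(xs, gamma, beta, moving_mean, moving_var,
                     momentum=0.9, epsilon=1e-5):
    """Training-mode batch norm for one channel (illustrative only)."""
    m = len(xs)
    mean = sum(xs) / m                            # mini-batch mean
    var = sum((x - mean) ** 2 for x in xs) / m    # mini-batch variance
    inv_std = 1.0 / math.sqrt(var + epsilon)
    # Normalize, then scale by gamma and shift by beta.
    ys = [gamma * (x - mean) * inv_std + beta for x in xs]
    # Update the moving statistics with the momentum rule.
    new_moving_mean = moving_mean * momentum + mean * (1.0 - momentum)
    new_moving_var = moving_var * momentum + var * (1.0 - momentum)
    return ys, new_moving_mean, new_moving_var


ys, moving_mean, moving_var = batch_norm_train(
    [1.0, 2.0, 3.0, 4.0], gamma=1.0, beta=0.0,
    moving_mean=0.0, moving_var=1.0)
```

With momentum = 0.9, each call moves the stored statistics only one tenth of the way toward the current mini-batch statistics, so they change slowly across training steps.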
When use_global_stats = True, \(\mu_{\beta}\) and \(\sigma_{\beta}^{2}\) are not the statistics of one mini-batch. They are global (or running) statistics, usually obtained from a pre-trained model. Training and testing (or inference) then behave the same way:
\[
\begin{aligned}
\hat{x_i} &\gets \frac{x_i - \mu_{\beta}}{\sqrt{\sigma_{\beta}^{2} + \epsilon}} \\
y_i &\gets \gamma \hat{x_i} + \beta
\end{aligned}
\]
Note
If build_strategy.sync_batch_norm=True, the batch_norm layers in the network use sync_batch_norm automatically. is_test = True can only be used in a test program or an inference program; it CANNOT be set to True in a train program. To use the global statistics from a pre-trained model in a train program, set use_global_stats = True.
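The use_global_stats = True behavior described above can be sketched in pure Python as follows. The helper batch_norm_global is hypothetical and handles a single channel; it only illustrates that the supplied moving statistics replace the mini-batch mean and variance during normalization.

```python
import math


def batch_norm_global(xs, gamma, beta, moving_mean, moving_var, epsilon=1e-5):
    """Normalize one channel with fixed global statistics (illustrative only)."""
    inv_std = 1.0 / math.sqrt(moving_var + epsilon)
    return [gamma * (x - moving_mean) * inv_std + beta for x in xs]


# Two different mini-batches that share the value 1.0 in first position.
out_a = batch_norm_global([1.0, 2.0, 3.0], gamma=1.0, beta=0.0,
                          moving_mean=2.0, moving_var=4.0)
out_b = batch_norm_global([1.0, 50.0], gamma=1.0, beta=0.0,
                          moving_mean=2.0, moving_var=4.0)
```

Because the statistics are fixed, the output for a given input value does not depend on the other values in the mini-batch: out_a[0] and out_b[0] are identical.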
Parameters
input (Tensor) – The input Tensor, whose rank can be 2, 3, 4, or 5. The data type is float16, float32, or float64.
act (string, Default None) – Activation type, linear|relu|prelu|…
is_test (bool, Default False) – A flag indicating whether the layer is in the test phase or not.
momentum (float|Tensor, Default 0.9) – The value used for the moving_mean and moving_var computation. This should be a float number or a Tensor with shape [1] and data type float32. The update formulas are \(moving\_mean = moving\_mean * momentum + new\_mean * (1. - momentum)\) and \(moving\_var = moving\_var * momentum + new\_var * (1. - momentum)\). Default is 0.9.
epsilon (float, Default 1e-05) – A value added to the denominator for numerical stability. Default is 1e-5.
param_attr (ParamAttr|None) – The parameter attribute for the scale parameter of batch_norm. If it is set to None or to one attribute of ParamAttr, batch_norm will create a ParamAttr as param_attr; the name of scale can be set in ParamAttr. If the Initializer of param_attr is not set, the parameter is initialized with Xavier. Default: None.
bias_attr (ParamAttr|None) – The parameter attribute for the bias of batch_norm. If it is set to None or to one attribute of ParamAttr, batch_norm will create a ParamAttr as bias_attr; the name of bias can be set in ParamAttr. If the Initializer of bias_attr is not set, the bias is initialized to zero. Default: None.
data_layout (str, optional) – Specify the data format of the input, and the data format of the output will be consistent with that of the input. An optional string from: “NCHW”, “NHWC”. The default is “NCHW”. When it is “NCHW”, the data is stored in the order of: [batch_size, input_channels, input_height, input_width].
in_place (bool, Default False) – Make the input and output of batch norm reuse memory.
name (str|None) – For detailed information, please refer to Name. Usually there is no need to set name; it is None by default.
moving_mean_name (str, Default None) – The name of the moving_mean, which stores the global mean. If it is set to None, batch_norm saves the global mean under a random name; otherwise, batch_norm saves the global mean under the given string.
moving_variance_name (str, Default None) – The name of the moving_variance, which stores the global variance. If it is set to None, batch_norm saves the global variance under a random name; otherwise, batch_norm saves the global variance under the given string.
do_model_average_for_mean_and_var (bool, Default True) – Whether model averaging is applied to the mean and variance parameters when model average is enabled.
use_global_stats (bool, Default False) – Whether to use the global mean and variance. In inference or test mode, setting use_global_stats to True and setting is_test to True are equivalent. In train mode, when use_global_stats is True, the global mean and variance are also used during training.
Returns
A Tensor that is the result of applying batch normalization to the input. It has the same shape and data type as the input.
Examples
    import paddle

    paddle.enable_static()

    x = paddle.static.data(name='x', shape=[3, 7, 3, 7], dtype='float32')
    hidden1 = paddle.static.nn.fc(x=x, size=200)
    print(hidden1.shape)
    # [3, 200]
    hidden2 = paddle.static.nn.batch_norm(input=hidden1)
    print(hidden2.shape)
    # [3, 200]