BatchNorm3D
- class paddle.nn.BatchNorm3D(num_features, momentum=0.9, epsilon=1e-05, weight_attr=None, bias_attr=None, data_format='NCDHW', use_global_stats=None, name=None)
-
Applies Batch Normalization over a 5D input (a mini-batch of 3D inputs with an additional channel dimension), as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.
When use_global_stats = False, \(\mu_{\beta}\) and \(\sigma_{\beta}^{2}\) are the statistics of one mini-batch, calculated as follows:
\[\begin{split}\mu_{\beta} &\gets \frac{1}{m} \sum_{i=1}^{m} x_i \qquad &//\ mini-batch\ mean \\ \sigma_{\beta}^{2} &\gets \frac{1}{m} \sum_{i=1}^{m}(x_i - \mu_{\beta})^2 \qquad &//\ mini-batch\ variance \\\end{split}\]
When use_global_stats = True, \(\mu_{\beta}\) and \(\sigma_{\beta}^{2}\) are not the statistics of one mini-batch. They are global (running) statistics, moving_mean and moving_variance, which are usually obtained from a pre-trained model. They are calculated as follows:
\[\begin{split}moving\_mean = moving\_mean * momentum + \mu_{\beta} * (1. - momentum) \quad &//\ global\ mean \\ moving\_variance = moving\_variance * momentum + \sigma_{\beta}^{2} * (1. - momentum) \quad &//\ global\ variance \\\end{split}\]
The normalization function formula is as follows:
\[\begin{split}\hat{x_i} &\gets \frac{x_i - \mu_\beta} {\sqrt{\sigma_{\beta}^{2} + \epsilon}} \qquad &//\ normalize \\ y_i &\gets \gamma \hat{x_i} + \beta \qquad &//\ scale\ and\ shift\end{split}\]
\(\epsilon\) : a small value added to the variance to prevent division by zero
\(\gamma\) : trainable scale parameter
\(\beta\) : trainable shift parameter
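The formulas above can be traced by hand for a single channel. The following is a minimal pure-Python sketch, not Paddle's implementation. The helper names batch_norm_channel and update_running are hypothetical, the starting values of the running statistics are chosen only for this example, and the values of one channel are assumed to be already flattened across the batch, depth, height, and width dimensions.

```python
import math


def batch_norm_channel(values, gamma=1.0, beta=0.0, epsilon=1e-5):
    """Normalize the flattened values of one channel with mini-batch statistics.

    Hypothetical helper for illustration only.
    """
    m = len(values)
    mean = sum(values) / m                          # mini-batch mean
    var = sum((x - mean) ** 2 for x in values) / m  # mini-batch variance (divides by m)
    inv_std = 1.0 / math.sqrt(var + epsilon)        # epsilon guards against division by zero
    out = [gamma * (x - mean) * inv_std + beta for x in values]  # normalize, scale, shift
    return out, mean, var


def update_running(moving, batch_stat, momentum=0.9):
    """Moving-average update from the formula above (hypothetical helper)."""
    return moving * momentum + batch_stat * (1.0 - momentum)


values = [1.0, 2.0, 3.0, 4.0]
out, mean, var = batch_norm_channel(values)

# Starting values 0.0 and 1.0 are assumptions made for this example.
moving_mean = update_running(0.0, mean)
moving_variance = update_running(1.0, var)
```

With the default momentum of 0.9, each update moves the running statistics only one tenth of the way toward the current mini-batch statistics.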
- Parameters
-
num_features (int) – The number of channels of the input Tensor.
epsilon (float, optional) – The small value added to the variance to prevent division by zero. Default: 1e-5.
momentum (float, optional) – The value used for the moving_mean and moving_variance computation. Default: 0.9.
weight_attr (ParamAttr|bool, optional) – The parameter attribute for Parameter scale of batch_norm. If it is set to None or one attribute of ParamAttr, batch_norm will create ParamAttr as weight_attr. If it is set to False, the weight is not learnable. If the Initializer of the weight_attr is not set, the parameter is initialized with ones. Default: None.
bias_attr (ParamAttr|bool, optional) – The parameter attribute for the bias of batch_norm. If it is set to None or one attribute of ParamAttr, batch_norm will create ParamAttr as bias_attr. If it is set to False, the bias is not learnable. If the Initializer of the bias_attr is not set, the bias is initialized with zeros. Default: None.
data_format (str, optional) – Specifies the input data format, which can be “NCDHW” or “NDHWC”, where N is the batch size, C is the number of channels, D is the depth of the feature map, H is the height of the feature map, and W is the width of the feature map. Default: “NCDHW”.
use_global_stats (bool|None, optional) – Whether to use the global mean and variance. If set to False, the statistics of one mini-batch are used. If set to True, the global statistics are used. If set to None, the global statistics are used in the test phase and the statistics of one mini-batch are used in the training phase. Default: None.
name (str, optional) – Name for the BatchNorm. Default: None. For more information, please refer to Name.
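The three settings of use_global_stats described above can be summarised as a small decision function. This is an illustrative sketch of the documented behaviour, not Paddle code; the function uses_global_stats and its training argument are hypothetical names.

```python
def uses_global_stats(use_global_stats, training):
    """Return True if global (running) statistics are used for normalization.

    Hypothetical helper mirroring the documented rules:
    - True  -> always use the global statistics
    - False -> always use the statistics of one mini-batch
    - None  -> global statistics in the test phase, mini-batch statistics in training
    """
    if use_global_stats is None:
        return not training
    return bool(use_global_stats)


# Print which statistics each combination selects.
for setting in (None, False, True):
    for training in (True, False):
        source = "global" if uses_global_stats(setting, training) else "mini-batch"
        print(f"use_global_stats={setting!s:<5} training={training!s:<5} -> {source}")
```

Only the default None makes the choice depend on the phase; True and False fix the choice regardless of whether the layer is training or being evaluated.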
- Shape:
-
x: 5-D tensor with shape (batch, num_features, depth, height, width) when data_format is “NCDHW”, or (batch, depth, height, width, num_features) when data_format is “NDHWC”.
output: 5-D tensor with the same shape as input x.
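The relationship between data_format, the channel axis, and the count m in the formulas can be sketched as follows. This is a pure-Python illustration that assumes the statistics are reduced over every axis except the channel axis; channel_stats_layout is a hypothetical helper, not part of Paddle.

```python
from math import prod


def channel_stats_layout(shape, data_format="NCDHW"):
    """Return (channel_axis, num_features, m) for a 5-D input shape.

    m is the number of elements per channel that enter the mean and variance.
    Hypothetical helper for illustration only.
    """
    if len(shape) != 5:
        raise ValueError("BatchNorm3D expects a 5-D input")
    if data_format not in ("NCDHW", "NDHWC"):
        raise ValueError("data_format must be 'NCDHW' or 'NDHWC'")
    channel_axis = data_format.index("C")  # position of C in the format string
    num_features = shape[channel_axis]
    m = prod(shape) // num_features        # elements reduced per channel
    return channel_axis, num_features, m


print(channel_stats_layout((2, 1, 2, 2, 3), "NCDHW"))  # (1, 1, 24)
print(channel_stats_layout((2, 2, 2, 3, 1), "NDHWC"))  # (4, 1, 24)
```

Both shapes describe the same data in different layouts, so the number of channels and the per-channel element count are identical; only the channel axis differs.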
- Returns
-
None
Examples
>>> import paddle
>>> paddle.seed(100)
>>> x = paddle.rand((2, 1, 2, 2, 3))
>>> batch_norm = paddle.nn.BatchNorm3D(1)
>>> batch_norm_out = batch_norm(x)
>>> print(batch_norm_out)
Tensor(shape=[2, 1, 2, 2, 3], dtype=float32, place=Place(cpu), stop_gradient=False,
[[[[[ 0.28011751, -0.95211101, -1.64757574],
    [ 0.14573872, -0.39522290, -0.76082933]],
   [[-1.01646376,  0.31086648, -1.66019011],
    [ 1.08991623, -0.54664266,  1.53283834]]]],
 [[[[ 1.33958006,  1.71585774, -0.12862551],
    [-0.66051245,  1.32629418, -0.06402326]],
   [[-0.28699064,  0.87359405,  0.42558217],
    [-0.46636176,  1.09858704, -1.55342245]]]]])