WeightNormParamAttr
class paddle.static.WeightNormParamAttr(dim=None, name=None, initializer=None, learning_rate=1.0, regularizer=None, trainable=True, do_model_average=False, need_clip=True) [source]
Note
Please use ‘paddle.nn.utils.weight_norm’ in dygraph mode.
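As a minimal dygraph sketch (the Conv2D layer and its channel sizes here are illustrative choices, not part of this API):

>>> import paddle
>>> from paddle.nn.utils import weight_norm
>>> conv = paddle.nn.Conv2D(3, 5, 3)  # illustrative layer: 3 in channels, 5 out, 3x3 kernel
>>> conv = weight_norm(conv)          # reparameterizes conv.weight into magnitude and direction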
Note

gradient_clip of ParamAttr HAS BEEN DEPRECATED since 2.0. Please use need_clip in ParamAttr to specify the clip scope. There are three clipping strategies: ClipGradByGlobalNorm, ClipGradByNorm, ClipGradByValue.

Parameter attribute for weight normalization. Weight normalization is a reparameterization of the weight vectors in a neural network that decouples the magnitude of those weight vectors from their direction. Weight normalization is implemented as described in the paper: Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks.
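Concretely, the paper reparameterizes each weight vector as \(w = g \frac{v}{\lVert v \rVert}\), where the scalar magnitude \(g\) and the direction vector \(v\) are trained separately.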
Parameters
dim (int, optional) – Dimension over which to compute the norm. dim must be a non-negative integer less than the rank of the weight Tensor. For example, dim can be chosen from 0, 1, 2, or 3 for a convolution whose weight shape is [cout, cin, kh, kw] and rank is 4. Default None, meaning that all elements will be normalized (see the sketch after this list).
name (str, optional) – The parameter’s name. Default None, meaning that the name would be created automatically. Please refer to Name for more details.
initializer (Initializer, optional) – The method to initialize this parameter, such as initializer = paddle.nn.initializer.Constant(1.0). Default None, meaning that the weight parameter is initialized by the Xavier initializer and the bias parameter is initialized to 0.
learning_rate (float32, optional) – The parameter's learning rate. The effective learning rate used by the optimizer is \(global\_lr * parameter\_lr * scheduler\_factor\). Default 1.0.
regularizer (WeightDecayRegularizer, optional) – Regularization strategy. There are two methods: L1Decay, L2Decay. If a regularizer is also set in the optimizer (such as SGD), the regularizer setting in the optimizer will be ignored. Default None, meaning there is no regularization.
trainable (bool, optional) – Whether this parameter is trainable. Default True.
do_model_average (bool, optional) – Whether model averaging should be applied to this parameter. Default False.
need_clip (bool, optional) – Whether the parameter gradient needs to be clipped in the optimizer. Default True.
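As a sketch of what dim selects (the tensors v and g and the manual norm computation below are illustrative, not part of the WeightNormParamAttr API; it assumes dim=0 keeps one magnitude per output channel):

>>> import paddle
>>> v = paddle.randn([8, 3, 3, 3])    # direction tensor shaped like a conv weight [cout, cin, kh, kw]
>>> g = paddle.ones([8])              # one magnitude per output channel when dim=0
>>> norm = paddle.linalg.norm(v.reshape([8, -1]), p=2, axis=1)  # per-filter L2 norm
>>> w = v * (g / norm).reshape([8, 1, 1, 1])                    # each filter rescaled to magnitude g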
Examples
>>> import paddle
>>> paddle.enable_static()

>>> data = paddle.static.data(name="data", shape=[3, 32, 32], dtype="float32")
>>> fc = paddle.static.nn.fc(x=data,
...                          size=1000,
...                          weight_attr=paddle.static.WeightNormParamAttr(
...                              dim=None,
...                              name='weight_norm_param',
...                              initializer=paddle.nn.initializer.Constant(1.0),
...                              learning_rate=1.0,
...                              regularizer=paddle.regularizer.L2Decay(0.1),
...                              trainable=True,
...                              do_model_average=False,
...                              need_clip=True))