ClipGradByNorm
class paddle.nn.ClipGradByNorm(clip_norm)
Limits the L2 norm of a multi-dimensional Tensor \(X\) to clip_norm. If the L2 norm of \(X\) is greater than clip_norm, \(X\) is scaled down by the ratio of clip_norm to its norm. If the L2 norm of \(X\) is less than or equal to clip_norm, \(X\) is left unchanged.

The multi-dimensional Tensor \(X\) is not passed to this class directly; it stands for the gradients of all parameters set in optimizer. If need_clip of a specific param is False in its ParamAttr, then the gradients of this param will not be clipped.

Gradient clipping takes effect after being set in optimizer; see the documentation of optimizer (for example: SGD).

The clipping formula is:
\[\begin{split}Out = \left\{ \begin{array}{ccl} X & & \text{if } norm(X) \leq clip\_norm \\ \frac{clip\_norm \cdot X}{norm(X)} & & \text{if } norm(X) > clip\_norm \\ \end{array} \right.\end{split}\]

where \(norm(X)\) represents the L2 norm of \(X\):

\[norm(X) = \left( \sum_{i=1}^{n} |x_i|^2 \right)^{\frac{1}{2}}\]

Note
need_clip of ClipGradByNorm has been deprecated since 2.0. Please use need_clip in ParamAttr to specify the clip scope.

Parameters

clip_norm (float) – The maximum norm value.
Examples
>>> import paddle

>>> # Build a small layer; only the weight is marked for clipping.
>>> x = paddle.uniform([10, 10], min=-1.0, max=1.0, dtype='float32')
>>> linear = paddle.nn.Linear(in_features=10, out_features=10,
...                           weight_attr=paddle.ParamAttr(need_clip=True),
...                           bias_attr=paddle.ParamAttr(need_clip=False))
>>> out = linear(x)
>>> loss = paddle.mean(out)
>>> loss.backward()

>>> # Pass the clip object to the optimizer so it is applied during step().
>>> clip = paddle.nn.ClipGradByNorm(clip_norm=1.0)
>>> sgd = paddle.optimizer.SGD(learning_rate=0.1, parameters=linear.parameters(), grad_clip=clip)
>>> sgd.step()
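To make the clipping formula above concrete, the following is a minimal pure-Python sketch of the same arithmetic on a flat list of numbers. The helper name `clip_by_norm` is hypothetical and is not part of Paddle; it only illustrates the formula and says nothing about how Paddle implements it internally.

```python
import math


def clip_by_norm(x, clip_norm):
    """Hypothetical helper (not a Paddle API): apply the clipping formula to a flat list."""
    # norm(X) = sqrt(sum of squared entries)
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= clip_norm:
        # Norm is already within the limit: return the values unchanged.
        return list(x)
    # Otherwise rescale every entry by clip_norm / norm(X).
    scale = clip_norm / norm
    return [v * scale for v in x]


grad = [3.0, 4.0]  # L2 norm is 5.0
clipped = clip_by_norm(grad, clip_norm=1.0)     # rescaled so the norm becomes 1.0
unchanged = clip_by_norm(grad, clip_norm=10.0)  # norm 5.0 <= 10.0, left as is
print(clipped, unchanged)
```

With clip_norm=1.0 the vector (3, 4) is scaled by 1/5, giving approximately (0.6, 0.8); its direction is preserved and only its length changes.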