KLDivLoss
class paddle.nn.KLDivLoss(reduction='mean', log_target=False)
Generate a callable object of 'KLDivLoss' to calculate the Kullback-Leibler divergence loss between Input(X) and Input(Target). Note that Input(X) is the log-probability and Input(Target) is the probability.
KL divergence loss is calculated as follows:
If log_target is False:
$$l(x, y) = y * (\log(y) - x)$$
If log_target is True:
$$l(x, y) = \exp(y) * (y - x)$$
Here, \(x\) is the input and \(y\) is the label.
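For reference, the elementwise formula above can be checked directly against the layer output. The following is a minimal sketch; the log_softmax/softmax inputs and the shape [4, 8] are only illustrative, not required by the API:
>>> import paddle
>>> import paddle.nn.functional as F
>>> x = F.log_softmax(paddle.randn([4, 8]), axis=-1)    # log-probabilities
>>> y = F.softmax(paddle.randn([4, 8]), axis=-1)        # probabilities
>>> elementwise = paddle.nn.KLDivLoss(reduction='none')(x, y)
>>> manual = y * (paddle.log(y) - x)                    # l(x, y) = y * (log(y) - x)
>>> print(paddle.allclose(elementwise, manual))
Tensor(shape=[], dtype=bool, place=Place(cpu), stop_gradient=True,
       True)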
- If reduction is 'none', the output loss is the same shape as the input, and the loss at each point is calculated separately; no reduction is applied to the result.
- If reduction is 'mean', the output loss has shape [], and the output is the average of all losses.
- If reduction is 'sum', the output loss has shape [], and the output is the sum of all losses.
- If reduction is 'batchmean', the output loss has shape [], and the output is the sum of all losses divided by the batch size (see the sketch after this list).
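A quick check of the 'batchmean' semantics described above, i.e. that it equals the 'sum' loss divided by the batch size (a minimal sketch; the inputs are illustrative):
>>> import paddle
>>> import paddle.nn.functional as F
>>> x = F.log_softmax(paddle.randn([4, 8]), axis=-1)
>>> y = F.softmax(paddle.randn([4, 8]), axis=-1)
>>> sum_loss = paddle.nn.KLDivLoss(reduction='sum')(x, y)
>>> batchmean_loss = paddle.nn.KLDivLoss(reduction='batchmean')(x, y)
>>> print(paddle.allclose(batchmean_loss, sum_loss / x.shape[0]))
Tensor(shape=[], dtype=bool, place=Place(cpu), stop_gradient=True,
       True)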
Parameters
- reduction (str, optional) – Indicate how to average the loss. The candidates are 'none' | 'batchmean' | 'mean' | 'sum'. If reduction is 'mean', the reduced mean loss is returned; if reduction is 'batchmean', the sum loss divided by the batch size is returned; if reduction is 'sum', the reduced sum loss is returned; if reduction is 'none', no reduction will be applied. Default is 'mean'.
- log_target (bool, optional) – Indicate whether the label is passed in log space. Default is False.
Shape:
- input (Tensor): (N, *), where * means any number of additional dimensions.
- label (Tensor): (N, *), same shape as the input.
- output (Tensor): tensor with shape [] by default.
Examples
>>> import paddle
>>> import paddle.nn as nn

>>> shape = (5, 20)

>>> x = paddle.uniform(shape, min=-10, max=10).astype('float32')
>>> target = paddle.uniform(shape, min=-10, max=10).astype('float32')

>>> # 'batchmean' reduction, loss shape will be []
>>> kldiv_criterion = nn.KLDivLoss(reduction='batchmean')
>>> pred_loss = kldiv_criterion(x, target)
>>> print(pred_loss.shape)
[]

>>> # 'mean' reduction, loss shape will be []
>>> kldiv_criterion = nn.KLDivLoss(reduction='mean')
>>> pred_loss = kldiv_criterion(x, target)
>>> print(pred_loss.shape)
[]

>>> # 'sum' reduction, loss shape will be []
>>> kldiv_criterion = nn.KLDivLoss(reduction='sum')
>>> pred_loss = kldiv_criterion(x, target)
>>> print(pred_loss.shape)
[]

>>> # 'none' reduction, loss shape is same with X shape
>>> kldiv_criterion = nn.KLDivLoss(reduction='none')
>>> pred_loss = kldiv_criterion(x, target)
>>> print(pred_loss.shape)
[5, 20]

>>> # if label is in the log space, set log_target = True
>>> target = paddle.uniform(shape, min=0, max=10).astype('float32')
>>> log_target = paddle.log(target)
>>> kldiv_criterion_1 = nn.KLDivLoss(reduction='none')
>>> kldiv_criterion_2 = nn.KLDivLoss(reduction='none', log_target=True)
>>> pred_loss_1 = kldiv_criterion_1(x, target)
>>> pred_loss_2 = kldiv_criterion_2(x, log_target)
>>> print(paddle.allclose(pred_loss_1, pred_loss_2))
Tensor(shape=[], dtype=bool, place=Place(cpu), stop_gradient=True,
       True)
forward(input, label)
Defines the computation performed at every call. Should be overridden by all subclasses.
Parameters
- *inputs (tuple) – unpacked tuple arguments
- **kwargs (dict) – unpacked dict arguments
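As with other paddle.nn.Layer subclasses, forward is normally invoked by calling the layer instance rather than by calling forward directly. A minimal sketch (the log_softmax/softmax inputs are illustrative):
>>> import paddle
>>> import paddle.nn.functional as F
>>> layer = paddle.nn.KLDivLoss(reduction='mean')
>>> x = F.log_softmax(paddle.randn([2, 3]), axis=-1)
>>> y = F.softmax(paddle.randn([2, 3]), axis=-1)
>>> loss = layer(x, y)    # dispatches to layer.forward(x, y)
>>> print(loss.shape)
[]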