KLDivLoss
class paddle.nn.KLDivLoss(reduction='mean', log_target=False)
Generate a callable object of 'KLDivLoss' to calculate the Kullback-Leibler divergence loss between Input(X) and Input(Target). Note that Input(X) is the log-probability and Input(Target) is the probability.
KL divergence loss is calculated as follows:
If log_target is False:
$$l(x, y) = y * (\log(y) - x)$$
If log_target is True:
$$l(x, y) = \exp(y) * (y - x)$$
Here, \(x\) is the input and \(y\) is the label.
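For reference, the elementwise formula above can be checked directly against the layer output. The following is a minimal sketch; the log_softmax/softmax inputs and the shape [4, 8] are only illustrative, not required by the API:
>>> import paddle
>>> import paddle.nn.functional as F
>>> x = F.log_softmax(paddle.randn([4, 8]), axis=-1)    # log-probabilities
>>> y = F.softmax(paddle.randn([4, 8]), axis=-1)        # probabilities
>>> elementwise = paddle.nn.KLDivLoss(reduction='none')(x, y)
>>> manual = y * (paddle.log(y) - x)                    # l(x, y) = y * (log(y) - x)
>>> print(paddle.allclose(elementwise, manual))
Tensor(shape=[], dtype=bool, place=Place(cpu), stop_gradient=True,
       True)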
- If reduction is 'none', the output loss is the same shape as the input, and the loss at each point is calculated separately; no reduction is applied to the result.
- If reduction is 'mean', the output loss has shape [], and the output is the average of all losses.
- If reduction is 'sum', the output loss has shape [], and the output is the sum of all losses.
- If reduction is 'batchmean', the output loss has shape [], and the output is the sum of all losses divided by the batch size (see the sketch after this list).
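A quick check of the 'batchmean' semantics described above, i.e. that it equals the 'sum' loss divided by the batch size (a minimal sketch; the inputs are illustrative):
>>> import paddle
>>> import paddle.nn.functional as F
>>> x = F.log_softmax(paddle.randn([4, 8]), axis=-1)
>>> y = F.softmax(paddle.randn([4, 8]), axis=-1)
>>> sum_loss = paddle.nn.KLDivLoss(reduction='sum')(x, y)
>>> batchmean_loss = paddle.nn.KLDivLoss(reduction='batchmean')(x, y)
>>> print(paddle.allclose(batchmean_loss, sum_loss / x.shape[0]))
Tensor(shape=[], dtype=bool, place=Place(cpu), stop_gradient=True,
       True)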
Parameters
- reduction (str, optional) – Indicate how to average the loss. The candidates are 'none' | 'batchmean' | 'mean' | 'sum'. If reduction is 'mean', the reduced mean loss is returned; if reduction is 'batchmean', the sum loss divided by the batch size is returned; if reduction is 'sum', the reduced sum loss is returned; if reduction is 'none', no reduction will be applied. Default is 'mean'.
- log_target (bool, optional) – Indicate whether the label is passed in log space. Default is False.
Shape:
- input (Tensor): (N, *), where * means any number of additional dimensions.
- label (Tensor): (N, *), same shape as the input.
- output (Tensor): tensor with shape [] by default.
Examples
>>> import paddle
>>> import paddle.nn as nn

>>> shape = (5, 20)

>>> x = paddle.uniform(shape, min=-10, max=10).astype('float32')
>>> target = paddle.uniform(shape, min=-10, max=10).astype('float32')

>>> # 'batchmean' reduction, loss shape will be []
>>> kldiv_criterion = nn.KLDivLoss(reduction='batchmean')
>>> pred_loss = kldiv_criterion(x, target)
>>> print(pred_loss.shape)
[]

>>> # 'mean' reduction, loss shape will be []
>>> kldiv_criterion = nn.KLDivLoss(reduction='mean')
>>> pred_loss = kldiv_criterion(x, target)
>>> print(pred_loss.shape)
[]

>>> # 'sum' reduction, loss shape will be []
>>> kldiv_criterion = nn.KLDivLoss(reduction='sum')
>>> pred_loss = kldiv_criterion(x, target)
>>> print(pred_loss.shape)
[]

>>> # 'none' reduction, loss shape is same with X shape
>>> kldiv_criterion = nn.KLDivLoss(reduction='none')
>>> pred_loss = kldiv_criterion(x, target)
>>> print(pred_loss.shape)
[5, 20]

>>> # if label is in the log space, set log_target = True
>>> target = paddle.uniform(shape, min=0, max=10).astype('float32')
>>> log_target = paddle.log(target)
>>> kldiv_criterion_1 = nn.KLDivLoss(reduction='none')
>>> kldiv_criterion_2 = nn.KLDivLoss(reduction='none', log_target=True)
>>> pred_loss_1 = kldiv_criterion_1(x, target)
>>> pred_loss_2 = kldiv_criterion_2(x, log_target)
>>> print(paddle.allclose(pred_loss_1, pred_loss_2))
Tensor(shape=[], dtype=bool, place=Place(cpu), stop_gradient=True,
       True)
forward(input, label)
Defines the computation performed at every call. Should be overridden by all subclasses.
Parameters
- *inputs (tuple) – unpacked tuple arguments
- **kwargs (dict) – unpacked dict arguments
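As with other paddle.nn.Layer subclasses, forward is normally invoked by calling the layer instance rather than by calling forward directly. A minimal sketch (the log_softmax/softmax inputs are illustrative):
>>> import paddle
>>> import paddle.nn.functional as F
>>> layer = paddle.nn.KLDivLoss(reduction='mean')
>>> x = F.log_softmax(paddle.randn([2, 3]), axis=-1)
>>> y = F.softmax(paddle.randn([2, 3]), axis=-1)
>>> loss = layer(x, y)    # dispatches to layer.forward(x, y)
>>> print(loss.shape)
[]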