kl_div

paddle.nn.functional.kl_div(input, label, reduction='mean', name=None) [source]

This operator calculates the Kullback-Leibler divergence loss between input and label. Note that input holds the log-probabilities and label holds the probabilities.

The KL divergence loss is calculated as follows:

$$l(x, y) = y \cdot (\log(y) - x)$$

where \(x\) is the input (log-probability) and \(y\) is the label (probability).
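As a quick numeric check of this formula (plain Python, independent of Paddle): with \(y = 0.5\) and \(x = \log(0.2)\), the pointwise loss is \(0.5 \cdot (\log(0.5) - \log(0.2)) = 0.5 \cdot \log(2.5) \approx 0.4581\).

import math

# Pointwise loss l(x, y) = y * (log(y) - x),
# evaluated by hand for y = 0.5 and x = log(0.2).
y, x = 0.5, math.log(0.2)
print(y * (math.log(y) - x))  # 0.5 * log(2.5), approximately 0.4581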

When reduction is 'none', the output loss has the same shape as the input; the loss at each point is calculated separately and no reduction is applied.

When reduction is 'mean', the output loss has shape [1] and its value is the mean of all pointwise losses.

When reduction is 'sum', the output loss has shape [1] and its value is the sum of all pointwise losses.

When reduction is 'batchmean', the output loss has shape [1] and its value is the sum of all pointwise losses divided by the batch size.
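A minimal sketch of how the reduction modes relate to the pointwise loss; the shapes and the batch size N=5 here are illustrative choices, not part of the API:

import paddle
import paddle.nn.functional as F

x = paddle.randn([5, 20])                       # treated as log-probabilities
y = F.softmax(paddle.randn([5, 20]), axis=-1)   # probabilities

pointwise = F.kl_div(x, y, reduction='none')    # shape [5, 20]

# Each reduced mode equals the corresponding reduction of the pointwise loss.
print(paddle.allclose(F.kl_div(x, y, reduction='mean'), pointwise.mean()))          # True
print(paddle.allclose(F.kl_div(x, y, reduction='sum'), pointwise.sum()))            # True
print(paddle.allclose(F.kl_div(x, y, reduction='batchmean'), pointwise.sum() / 5))  # True, N=5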

Parameters
  • input (Tensor) – The input tensor. The shape is [N, *], where N is the batch size and * means any number of additional dimensions. Its data type should be float32 or float64.

  • label (Tensor) – The label tensor. The shape is [N, *], the same as input. Its data type should be float32 or float64.

  • reduction (str, optional) – Indicates how to reduce the loss; the candidates are 'none' | 'batchmean' | 'mean' | 'sum'. If reduction is 'mean', the mean of all pointwise losses is returned; if reduction is 'batchmean', the sum of losses divided by the batch size is returned; if reduction is 'sum', the sum of all losses is returned; if reduction is 'none', no reduction is applied. Default is 'mean'.

  • name (str, optional) – Name for the operation (optional, default is None). For more information, please refer to Name.

Returns

The KL divergence loss. Its data type is the same as the input tensor's.

Return type

Tensor

Examples

import paddle
import numpy as np
import paddle.nn.functional as F

shape = (5, 20)
x = paddle.to_tensor(np.random.uniform(-10, 10, shape).astype('float32'))
target = paddle.to_tensor(np.random.uniform(-10, 10, shape).astype('float32'))

# 'batchmean' reduction, loss shape will be [1]
pred_loss = F.kl_div(x, target, reduction='batchmean')
# shape=[1]

# 'mean' reduction, loss shape will be [1]
pred_loss = F.kl_div(x, target, reduction='mean')
# shape=[1]

# 'sum' reduction, loss shape will be [1]
pred_loss = F.kl_div(x, target, reduction='sum')
# shape=[1]

# 'none' reduction, loss shape is the same as the input shape
pred_loss = F.kl_div(x, target, reduction='none')
# shape=[5, 20]
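Since input is interpreted as log-probabilities, a typical pattern is to feed the output of log_softmax as input and softmax as label. The sketch below assumes a classification-style setting; the logits names and shapes are hypothetical, not part of the API:

import paddle
import paddle.nn.functional as F

# Hypothetical logits for an 8-sample, 10-class problem.
student_logits = paddle.randn([8, 10])
teacher_logits = paddle.randn([8, 10])

log_probs = F.log_softmax(student_logits, axis=-1)   # log-probability input
target_probs = F.softmax(teacher_logits, axis=-1)    # probability label

loss = F.kl_div(log_probs, target_probs, reduction='batchmean')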