gaussian_nll_loss
paddle.nn.functional.gaussian_nll_loss(input, label, variance, full=False, epsilon=1e-06, reduction='mean', name=None)
Gaussian negative log likelihood loss.

Computes the Gaussian negative log likelihood loss among input, variance and label. The label is treated as a sample from a Gaussian distribution. This function is used to train a neural network that predicts the input (expectation) and variance of the Gaussian distribution that label is supposed to come from; input and variance should therefore be functions (the neural network) of some inputs.

For a label having a Gaussian distribution with expectation input and variance variance predicted by the neural network, the loss is calculated as follows:

\[\text{loss} = \frac{1}{2}\left(\log\left(\max\left(\text{var}, \ \text{epsilon}\right)\right) + \frac{\left(\text{input} - \text{label}\right)^2}{\max\left(\text{var}, \ \text{epsilon}\right)}\right) + \text{const.}\]

where epsilon is used for numerical stability. By default, the constant term of the loss function is omitted unless full is True. If variance is not the same size as input (due to a homoscedastic assumption), it must either have a final dimension of 1 or have one fewer dimension (with all other sizes being the same) for correct broadcasting; see the sketch after the parameter list below.
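As a quick check of the formula above, the following minimal sketch recomputes the loss by hand with basic paddle ops (illustrative values; the 0.5 * log(2 * pi) constant used for full=True is the standard Gaussian normalization term, assumed here rather than stated on this page):

>>> import math
>>> import paddle
>>> import paddle.nn.functional as F
>>> paddle.seed(2023)
>>> input = paddle.randn([5, 2])
>>> label = paddle.randn([5, 2])
>>> variance = paddle.rand([5, 2]) + 0.1  # keep variances strictly positive
>>> # the loss from the formula, with the constant term omitted (full=False)
>>> clamped = paddle.clip(variance, min=1e-6)
>>> manual = 0.5 * (paddle.log(clamped) + (input - label) ** 2 / clamped)
>>> assert paddle.allclose(manual, F.gaussian_nll_loss(input, label, variance, reduction='none'))
>>> # with full=True, the Gaussian normalization constant is added per element
>>> manual_full = manual + 0.5 * math.log(2 * math.pi)
>>> assert paddle.allclose(manual_full, F.gaussian_nll_loss(input, label, variance, full=True, reduction='none'))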
- Parameters
input (Tensor) – input tensor, \((N, *)\) or \((*)\) where \(*\) means any number of additional dimensions. Expectation of the Gaussian distribution. Available dtypes are float32 and float64.
label (Tensor) – target label tensor, \((N, *)\) or \((*)\), same shape as the input, or same shape as the input but with one dimension equal to 1 (to allow for broadcasting). Sample from the Gaussian distribution. Available dtypes are float32 and float64.
variance (Tensor) – tensor of positive variance(s), \((N, *)\) or \((*)\), same shape as the input, or same shape as the input but with one dimension equal to 1, or same shape as the input but with one fewer dimension (to allow for broadcasting; see the sketch after this list). One for each of the expectations in the input (heteroscedastic), or a single one (homoscedastic). Available dtypes are float32 and float64.
full (bool, optional) – whether to include the constant term in the loss calculation. Default: False.
epsilon (float, optional) – value used to clamp variance (see note below), for stability. Default: 1e-6.
reduction (str, optional) – specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied; 'mean': the output is the average of all batch member losses; 'sum': the output is the sum of all batch member losses. Default: 'mean'.
name (str, optional) – Name for the operation (optional, default is None). For more information, please refer to Name.
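The homoscedastic cases described above can be checked directly. This minimal sketch assumes, per the shape rules stated for variance, that the accepted variance shapes all broadcast to the same elementwise loss:

>>> import paddle
>>> import paddle.nn.functional as F
>>> input = paddle.randn([5, 2])
>>> label = paddle.randn([5, 2])
>>> # heteroscedastic: one variance per expectation in input
>>> var_full = paddle.ones([5, 2])
>>> # homoscedastic: final dimension of 1, broadcast over the last axis
>>> var_keepdim = paddle.ones([5, 1])
>>> # homoscedastic: one fewer dimension, all other sizes the same
>>> var_fewer = paddle.ones([5])
>>> loss_a = F.gaussian_nll_loss(input, label, var_full, reduction='none')
>>> loss_b = F.gaussian_nll_loss(input, label, var_keepdim, reduction='none')
>>> loss_c = F.gaussian_nll_loss(input, label, var_fewer, reduction='none')
>>> assert paddle.allclose(loss_a, loss_b)
>>> assert paddle.allclose(loss_a, loss_c)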
- Returns
If reduction is 'none', the shape of the output is the same as input; otherwise the shape of the output is [].
- Return type
output (Tensor)
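A quick way to confirm the return shapes (a minimal sketch, with illustrative inputs):

>>> import paddle
>>> import paddle.nn.functional as F
>>> input = paddle.randn([5, 2])
>>> label = paddle.randn([5, 2])
>>> variance = paddle.ones([5, 2])
>>> for reduction in ('none', 'mean', 'sum'):
...     loss = F.gaussian_nll_loss(input, label, variance, reduction=reduction)
...     print(reduction, loss.shape)
none [5, 2]
mean []
sum []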
- Examples

>>> import paddle
>>> import paddle.nn.functional as F
>>> paddle.seed(2023)
>>> input = paddle.randn([5, 2], dtype=paddle.float32)
>>> label = paddle.randn([5, 2], dtype=paddle.float32)
>>> variance = paddle.ones([5, 2], dtype=paddle.float32)
>>> loss = F.gaussian_nll_loss(input, label, variance, reduction='none')
>>> print(loss)
Tensor(shape=[5, 2], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[0.21808575, 1.43013096],
        [1.05245590, 0.00394560],
        [1.20861185, 0.00000062],
        [0.56946373, 0.73300570],
        [0.37142906, 0.12038800]])
>>> loss = F.gaussian_nll_loss(input, label, variance, reduction='mean')
>>> print(loss)
Tensor(shape=[], dtype=float32, place=Place(cpu), stop_gradient=True,
       0.57075173)
Note
The clamping of variance is ignored with respect to autograd, so the gradients are unaffected by it.
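In other words, epsilon only affects the forward value; gradients still flow as if no clamp were applied. A minimal sketch of this behavior (the tiny variance value is illustrative): a hard clamp would zero the gradient of variance wherever variance < epsilon, whereas here it stays nonzero.

>>> import paddle
>>> import paddle.nn.functional as F
>>> input = paddle.to_tensor([0.0])
>>> label = paddle.to_tensor([1.0])
>>> # variance far below epsilon: the forward pass clamps it to 1e-6
>>> variance = paddle.to_tensor([1e-12], stop_gradient=False)
>>> loss = F.gaussian_nll_loss(input, label, variance, reduction='sum')
>>> loss.backward()
>>> # nonzero because the clamp is ignored with respect to autograd
>>> assert bool((variance.grad != 0).all())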