rnnt_loss
paddle.nn.functional.rnnt_loss(input, label, input_lengths, label_lengths, blank=0, fastemit_lambda=0.001, reduction='mean', name=None)
An operator that integrates the open-source warp-transducer library (https://github.com/b-flo/warp-transducer.git) to compute the Sequence Transduction with Recurrent Neural Networks (RNN-T) loss.
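For orientation, the objective below is the standard transducer formulation from the literature (not copied from the library's source): the negative log-probability of the label sequence, marginalized over all blank-augmented alignments, with FastEmit adding a regularization term scaled by fastemit_lambda.

```latex
% Standard RNN-T objective: B^{-1}(y) is the set of blank-augmented alignments
% a through the T x U lattice that collapse to the label sequence y.
\mathcal{L}_{\mathrm{RNNT}}(\mathbf{x}, \mathbf{y})
  = -\ln p(\mathbf{y} \mid \mathbf{x})
  = -\ln \sum_{\mathbf{a} \in \mathcal{B}^{-1}(\mathbf{y})} p(\mathbf{a} \mid \mathbf{x})
% FastEmit (https://arxiv.org/pdf/2010.11148.pdf) regularizes this objective;
% lambda below corresponds to the fastemit_lambda argument.
\tilde{\mathcal{L}} = \mathcal{L}_{\mathrm{RNNT}} + \lambda \, \mathcal{L}_{\mathrm{FastEmit}}
```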
- Parameters
  - input (Tensor) – The padded log-probability sequence, a 4-D Tensor of shape [B, Tmax, Umax, D], where Tmax is the longest input logit sequence length and D is the number of classes (including the blank). The data type should be float32 or float64. See the shape sketch after this parameter list.
  - label (Tensor) – The padded ground-truth label sequence, a 2-D Tensor of shape [B, Umax], where Umax is the longest label sequence length. The data type must be int32.
  - input_lengths (Tensor) – The length of each input sequence, of shape [batch_size] and dtype int64.
  - label_lengths (Tensor) – The length of each label sequence, of shape [batch_size] and dtype int64.
  - blank (int, optional) – The blank label index of the RNN-T loss, which lies in the half-open interval [0, D). Default is 0.
  - fastemit_lambda (float, optional) – Regularization parameter for FastEmit (https://arxiv.org/pdf/2010.11148.pdf). Default is 0.001.
  - reduction (str, optional) – Indicates how to reduce the loss; the candidates are 'none' | 'mean' | 'sum'. If reduction is 'mean', the summed loss is divided by the batch size; if reduction is 'sum', the total loss is returned; if reduction is 'none', no reduction is applied. Default is 'mean'.
  - name (str, optional) – Name for the operation (optional, default is None). For more information, please refer to Name.
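As a quick sanity check on the shapes above, here is a minimal sketch (the variable names are illustrative, not part of the API) that builds inputs with the expected layout. Note that, as in the bundled example below, the input's third dimension is the maximum label length plus one, as warp-transducer expects:

```python
import paddle
import paddle.nn.functional as F

B, Tmax, U, D = 4, 50, 10, 32   # batch, frames, max label length, classes incl. blank

# Joint-network output has shape [B, Tmax, U + 1, D]; rnnt_loss consumes log-probabilities.
logits = paddle.randn([B, Tmax, U + 1, D])
logprobs = F.log_softmax(logits, axis=-1)
logprobs.stop_gradient = False   # enable gradients w.r.t. the activations

# Labels avoid index 0, which serves as the blank here.
labels = paddle.randint(1, D, [B, U], dtype='int32')
input_lengths = paddle.full([B], Tmax, dtype='int32')   # int32, matching the example below
label_lengths = paddle.full([B], U, dtype='int32')

loss = F.rnnt_loss(logprobs, labels, input_lengths, label_lengths,
                   blank=0, reduction='mean')
print(loss.shape)   # [] -- scalar, since reduction='mean'
```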
- Returns
  Tensor. The RNN-T loss between input and label. If reduction is 'none', the shape of the loss is [batch_size]; otherwise, the shape is []. The data type is the same as input.
Examples
```python
# declarative mode
import functools

import numpy as np
import paddle
import paddle.nn.functional as F

fn = functools.partial(F.rnnt_loss, reduction='sum', fastemit_lambda=0.0, blank=0)

# One utterance: acts has shape [B=1, T=2, U+1=3, D=5].
acts = np.array([[[[0.1, 0.6, 0.1, 0.1, 0.1],
                   [0.1, 0.1, 0.6, 0.1, 0.1],
                   [0.1, 0.1, 0.2, 0.8, 0.1]],
                  [[0.1, 0.6, 0.1, 0.1, 0.1],
                   [0.1, 0.1, 0.2, 0.1, 0.1],
                   [0.7, 0.1, 0.2, 0.1, 0.1]]]])
labels = [[1, 2]]

acts = paddle.to_tensor(acts, stop_gradient=False)

lengths = [acts.shape[1]] * acts.shape[0]          # every utterance uses all T frames
label_lengths = [len(l) for l in labels]
labels = paddle.to_tensor(labels, paddle.int32)
lengths = paddle.to_tensor(lengths, paddle.int32)
label_lengths = paddle.to_tensor(label_lengths, paddle.int32)

costs = fn(acts, labels, lengths, label_lengths)
print(costs)
# Tensor(shape=[], dtype=float64, place=Place(gpu:0), stop_gradient=False,
#        4.49566677)
```
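A small follow-up to the example above: switching the same call to reduction='none' returns one loss per batch element instead of a scalar, which is handy when inspecting individual utterances.

```python
# Same tensors as above, but without reduction: one loss value per utterance.
costs_per_utt = F.rnnt_loss(acts, labels, lengths, label_lengths,
                            blank=0, fastemit_lambda=0.0, reduction='none')
print(costs_per_utt.shape)   # [1] -- batch_size entries
```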