rnnt_loss
paddle.nn.functional.rnnt_loss(input, label, input_lengths, label_lengths, blank=0, fastemit_lambda=0.001, reduction='mean', name=None)
An operator that integrates the open-source warp-transducer library (https://github.com/b-flo/warp-transducer.git) to compute the Sequence Transduction with Recurrent Neural Networks (RNN-T) loss.
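For orientation, the objective below is the standard transducer formulation from the literature (not copied from the library's source): the negative log-probability of the label sequence, marginalized over all blank-augmented alignments, with FastEmit adding a regularization term scaled by fastemit_lambda.

```latex
% Standard RNN-T objective: B^{-1}(y) is the set of blank-augmented alignments
% a through the T x U lattice that collapse to the label sequence y.
\mathcal{L}_{\mathrm{RNNT}}(\mathbf{x}, \mathbf{y})
  = -\ln p(\mathbf{y} \mid \mathbf{x})
  = -\ln \sum_{\mathbf{a} \in \mathcal{B}^{-1}(\mathbf{y})} p(\mathbf{a} \mid \mathbf{x})
% FastEmit (https://arxiv.org/pdf/2010.11148.pdf) regularizes this objective;
% lambda below corresponds to the fastemit_lambda argument.
\tilde{\mathcal{L}} = \mathcal{L}_{\mathrm{RNNT}} + \lambda \, \mathcal{L}_{\mathrm{FastEmit}}
```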
- Parameters
  - input (Tensor) – The padded log-probability sequence, a 4-D Tensor of shape [B, Tmax, Umax, D], where Tmax is the longest input logit sequence length and D is the number of classes (including the blank). The data type should be float32 or float64. See the shape sketch after this parameter list.
  - label (Tensor) – The padded ground-truth label sequence, a 2-D Tensor of shape [B, Umax], where Umax is the longest label sequence length. The data type must be int32.
  - input_lengths (Tensor) – The length of each input sequence, of shape [batch_size] and dtype int64.
  - label_lengths (Tensor) – The length of each label sequence, of shape [batch_size] and dtype int64.
  - blank (int, optional) – The blank label index of the RNN-T loss, which lies in the half-open interval [0, D). Default is 0.
  - fastemit_lambda (float, optional) – Regularization parameter for FastEmit (https://arxiv.org/pdf/2010.11148.pdf). Default is 0.001.
  - reduction (str, optional) – Indicates how to reduce the loss; the candidates are 'none' | 'mean' | 'sum'. If reduction is 'mean', the summed loss is divided by the batch size; if reduction is 'sum', the total loss is returned; if reduction is 'none', no reduction is applied. Default is 'mean'.
  - name (str, optional) – Name for the operation (optional, default is None). For more information, please refer to Name.
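As a quick sanity check on the shapes above, here is a minimal sketch (the variable names are illustrative, not part of the API) that builds inputs with the expected layout. Note that, as in the bundled example below, the input's third dimension is the maximum label length plus one, as warp-transducer expects:

```python
import paddle
import paddle.nn.functional as F

B, Tmax, U, D = 4, 50, 10, 32   # batch, frames, max label length, classes incl. blank

# Joint-network output has shape [B, Tmax, U + 1, D]; rnnt_loss consumes log-probabilities.
logits = paddle.randn([B, Tmax, U + 1, D])
logprobs = F.log_softmax(logits, axis=-1)
logprobs.stop_gradient = False   # enable gradients w.r.t. the activations

# Labels avoid index 0, which serves as the blank here.
labels = paddle.randint(1, D, [B, U], dtype='int32')
input_lengths = paddle.full([B], Tmax, dtype='int32')   # int32, matching the example below
label_lengths = paddle.full([B], U, dtype='int32')

loss = F.rnnt_loss(logprobs, labels, input_lengths, label_lengths,
                   blank=0, reduction='mean')
print(loss.shape)   # [] -- scalar, since reduction='mean'
```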
- Returns
  Tensor. The RNN-T loss between input and label. If reduction is 'none', the shape of the loss is [batch_size]; otherwise, the shape is []. The data type is the same as input.
Examples
```python
# declarative mode
import functools

import numpy as np
import paddle
import paddle.nn.functional as F

fn = functools.partial(F.rnnt_loss, reduction='sum', fastemit_lambda=0.0, blank=0)

# One utterance: acts has shape [B=1, T=2, U+1=3, D=5].
acts = np.array([[[[0.1, 0.6, 0.1, 0.1, 0.1],
                   [0.1, 0.1, 0.6, 0.1, 0.1],
                   [0.1, 0.1, 0.2, 0.8, 0.1]],
                  [[0.1, 0.6, 0.1, 0.1, 0.1],
                   [0.1, 0.1, 0.2, 0.1, 0.1],
                   [0.7, 0.1, 0.2, 0.1, 0.1]]]])
labels = [[1, 2]]

acts = paddle.to_tensor(acts, stop_gradient=False)

lengths = [acts.shape[1]] * acts.shape[0]          # every utterance uses all T frames
label_lengths = [len(l) for l in labels]
labels = paddle.to_tensor(labels, paddle.int32)
lengths = paddle.to_tensor(lengths, paddle.int32)
label_lengths = paddle.to_tensor(label_lengths, paddle.int32)

costs = fn(acts, labels, lengths, label_lengths)
print(costs)
# Tensor(shape=[], dtype=float64, place=Place(gpu:0), stop_gradient=False,
#        4.49566677)
```
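A small follow-up to the example above: switching the same call to reduction='none' returns one loss per batch element instead of a scalar, which is handy when inspecting individual utterances.

```python
# Same tensors as above, but without reduction: one loss value per utterance.
costs_per_utt = F.rnnt_loss(acts, labels, lengths, label_lengths,
                            blank=0, fastemit_lambda=0.0, reduction='none')
print(costs_per_utt.shape)   # [1] -- batch_size entries
```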