adaptive_log_softmax_with_loss
- paddle.nn.functional.adaptive_log_softmax_with_loss(input, label, head_weight, tail_weights, cutoffs, head_bias=None, name=None) [source]
Compute the adaptive log softmax result and the negative log likelihood between input and label. The parameters head_weight, tail_weights, and cutoffs are inner members of AdaptiveLogSoftmaxWithLoss; please refer to AdaptiveLogSoftmaxWithLoss.
- Parameters
-
input (Tensor) – Input tensor, the data type should be float32 or float64.
label (Tensor) – Label tensor containing the target class indices, the data type should be int64.
head_weight (Tensor) – Weight tensor for the head linear computation, the data type should be float32 or float64. The shape should be [input.shape[1], shortlist_size + n_clusters], where shortlist_size is the first element of the cutoffs list and n_clusters is the length of the cutoffs list minus 1.
tail_weights (list[Tensor]) – Weight tensor list for the tail linear computations, the data type should be float32 or float64. The number of elements in tail_weights depends on n_clusters; for each cluster the list holds the weights of two linear layers, with shapes [input.shape[1], hsz] and [hsz, osz], where hsz is in_features (the number of input features) divided by div_value raised to the power (i + 1), with the cluster index i running from 0 to n_clusters - 1, and osz is the difference between the (i + 1)-th cutoff and the i-th cutoff; see the shape-derivation sketch after this parameter list.
cutoffs (Sequence) – Cutoffs used to assign targets to their buckets.
head_bias (Tensor, optional) – Bias tensor for the head linear computation, the data type should be float32 or float64. Default: None.
name (str, optional) – Name for the operation (optional, default is None). For more information, please refer to Name.
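To make the shape rules above concrete, the following sketch derives the expected head_weight shape and the per-cluster tail_weights shapes from in_features, n_classes, cutoffs, and div_value. The helper adaptive_weight_shapes is hypothetical (not part of Paddle) and assumes the layer-style convention in which n_classes is appended to the user-supplied cutoffs list; the functional API itself does not enforce the div_value rule (the example below uses hsz = 2).

>>> # Hypothetical helper, not part of Paddle: derives the weight shapes
>>> # described above, assuming n_classes is appended to the cutoffs list.
>>> def adaptive_weight_shapes(in_features, n_classes, cutoffs, div_value=4.0):
...     full_cutoffs = list(cutoffs) + [n_classes]
...     n_clusters = len(full_cutoffs) - 1
...     shortlist_size = full_cutoffs[0]
...     head_shape = [in_features, shortlist_size + n_clusters]
...     tail_shapes = []
...     for i in range(n_clusters):
...         hsz = int(in_features // div_value ** (i + 1))  # hidden size of cluster i
...         osz = full_cutoffs[i + 1] - full_cutoffs[i]     # classes in cluster i
...         tail_shapes.append(([in_features, hsz], [hsz, osz]))
...     return head_shape, tail_shapes
>>> adaptive_weight_shapes(5, 3, [2])
([5, 3], [([5, 1], [1, 1])])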
- Returns
-
output (Tensor). The tensor storing the adaptive log softmax result; the shape of output is [N].
loss (Tensor). The tensor storing the adaptive_log_softmax_loss of input and label.
Examples
>>> import paddle
>>> import paddle.nn.functional as F

>>> paddle.seed(2024)
>>> input = paddle.randn([3, 5], dtype=paddle.float32)
>>> head_weight = paddle.randn([5, 3], dtype=paddle.float32)
>>> head_bias = paddle.randn([3], dtype=paddle.float32)
>>> tail_weights = []
>>> tail_weights.append(paddle.randn([5, 2], dtype=paddle.float32))
>>> tail_weights.append(paddle.randn([2, 1], dtype=paddle.float32))
>>> out, loss = F.adaptive_log_softmax_with_loss(
...     input,
...     paddle.full((3,), 1, dtype='int64'),
...     head_weight,
...     tail_weights,
...     cutoffs=[2],
...     head_bias=head_bias)
>>> print(out)
Tensor(shape=[3], dtype=float32, place=Place(cpu), stop_gradient=True,
       [-0.99842924, -2.27753878, -0.16740258])
>>> print(loss)
Tensor(shape=[], dtype=float32, place=Place(cpu), stop_gradient=True,
       1.14779019)
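In everyday training code it is usually simpler to let the companion layer own the head and tail parameters instead of passing them by hand. The sketch below is illustrative only; it assumes the paddle.nn.AdaptiveLogSoftmaxWithLoss layer referenced above takes (in_features, n_classes, cutoffs, div_value=4.0, head_bias=False) and returns the same (output, loss) pair as the functional form. Check that class's documentation for the exact signature.

>>> # Illustrative sketch: assumes the companion layer
>>> # paddle.nn.AdaptiveLogSoftmaxWithLoss referenced above, which builds the
>>> # head/tail weights internally from in_features, n_classes, and cutoffs.
>>> import paddle
>>> paddle.seed(2024)
>>> asfm = paddle.nn.AdaptiveLogSoftmaxWithLoss(
...     in_features=64, n_classes=1000, cutoffs=[100, 500])
>>> features = paddle.randn([8, 64])
>>> labels = paddle.randint(0, 1000, [8])   # int64 class indices
>>> out, loss = asfm(features, labels)      # same pair as the functional form
>>> loss.backward()                         # gradients flow to head and tail weights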