softmax

paddle.fluid.layers.nn.softmax(input, use_cudnn=True, name=None, axis=-1) [source]

This operator implements the softmax layer. The calculation process is as follows:

  1. The dimension axis of the input will be permuted to the last.

  2. The input tensor is then logically flattened to a 2-D matrix. The matrix's second dimension (row length) is the size of dimension axis of the input tensor, and the first dimension (column length) is the product of all the other dimensions. For each row of the matrix, the softmax operator squashes the K-dimensional vector of arbitrary real values (K is the row length, i.e. the size of the input tensor's dimension axis) to a K-dimensional vector of real values in the range [0, 1] that sum to 1.

  3. After the softmax operation is completed, the inverse operations of steps 1 and 2 are performed to restore the output to the same shape as the input.

For each element, it computes the exponential of that element and the sum of the exponentials of all K elements along the given dimension. The ratio of the exponential to that sum is the output of the softmax operator.

For each row \(i\) and each column \(j\) in the matrix, we have:

\[Out[i, j] = \frac{\exp(X[i, j])}{\sum_j \exp(X[i, j])}\]
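The three steps above can be sketched as a NumPy reference (an illustrative sketch, not Paddle's actual kernel; the name `softmax_ref` is chosen here for illustration):

```python
import numpy as np

def softmax_ref(x, axis=-1):
    """Softmax along `axis`, following the permute / flatten / restore steps."""
    # Step 1: permute the target axis to the last position.
    axis = axis % x.ndim
    perm = [i for i in range(x.ndim) if i != axis] + [axis]
    x_perm = np.transpose(x, perm)
    # Step 2: flatten to a 2-D matrix of shape [prod(other dims), K] and
    # apply a numerically stable row-wise softmax.
    mat = x_perm.reshape(-1, x_perm.shape[-1])
    mat = mat - mat.max(axis=1, keepdims=True)  # subtracting the max avoids overflow
    e = np.exp(mat)
    out = e / e.sum(axis=1, keepdims=True)
    # Step 3: undo the flatten and the permutation to restore the input shape.
    out = out.reshape(x_perm.shape)
    return np.transpose(out, np.argsort(perm))

x = np.array([[2.0, 3.0, 4.0, 5.0]])
print(softmax_ref(x, axis=-1))  # each row sums to 1
```

The max-subtraction does not change the result (softmax is shift-invariant) but keeps `exp` from overflowing for large inputs.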

Example:

Case 1:
  Input:
    X.shape = [2, 3, 4]
    X.data = [[[2.0, 3.0, 4.0, 5.0],
               [3.0, 4.0, 5.0, 6.0],
               [7.0, 8.0, 8.0, 9.0]],
              [[1.0, 2.0, 3.0, 4.0],
               [5.0, 6.0, 7.0, 8.0],
               [6.0, 7.0, 8.0, 9.0]]]

  Attrs:
    axis = -1

  Output:
    Out.shape = [2, 3, 4]
    Out.data = [[[0.0320586 , 0.08714432, 0.23688282, 0.64391426],
                 [0.0320586 , 0.08714432, 0.23688282, 0.64391426],
                 [0.07232949, 0.19661193, 0.19661193, 0.53444665]],
                [[0.0320586 , 0.08714432, 0.23688282, 0.64391426],
                 [0.0320586 , 0.08714432, 0.23688282, 0.64391426],
                 [0.0320586 , 0.08714432, 0.23688282, 0.64391426]]]

Case 2:
  Input:
    X.shape = [2, 3, 4]
    X.data = [[[2.0, 3.0, 4.0, 5.0],
               [3.0, 4.0, 5.0, 6.0],
               [7.0, 8.0, 8.0, 9.0]],
              [[1.0, 2.0, 3.0, 4.0],
               [5.0, 6.0, 7.0, 8.0],
               [6.0, 7.0, 8.0, 9.0]]]
  Attrs:
    axis = 1

  Output:
    Out.shape = [2, 3, 4]
    Out.data = [[[0.00657326, 0.00657326, 0.01714783, 0.01714783],
                 [0.01786798, 0.01786798, 0.04661262, 0.04661262],
                 [0.97555875, 0.97555875, 0.93623955, 0.93623955]],
                [[0.00490169, 0.00490169, 0.00490169, 0.00490169],
                 [0.26762315, 0.26762315, 0.26762315, 0.26762315],
                 [0.72747516, 0.72747516, 0.72747516, 0.72747516]]]
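Case 2 can be reproduced independently of Paddle with a direct NumPy computation (a sanity check, not part of the API):

```python
import numpy as np

# Input tensor from Case 2 above.
x = np.array([[[2.0, 3.0, 4.0, 5.0],
               [3.0, 4.0, 5.0, 6.0],
               [7.0, 8.0, 8.0, 9.0]],
              [[1.0, 2.0, 3.0, 4.0],
               [5.0, 6.0, 7.0, 8.0],
               [6.0, 7.0, 8.0, 9.0]]])

# Softmax along axis=1: exponentiate and normalize over that axis.
e = np.exp(x)
out = e / e.sum(axis=1, keepdims=True)
print(np.round(out, 8))
print(out.sum(axis=1))  # every entry is 1.0, as softmax guarantees
```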

Parameters
  • input (Tensor) – The input tensor. A multi-dimension Tensor with type float32 or float64.

  • use_cudnn (bool, optional) – Whether to use the cuDNN kernel; it takes effect only when the cuDNN library is installed. Defaults to True for better performance.

  • name (str, optional) – Name for the operation. Normally there is no need for the user to set this property. For more information, please refer to Name. Default: None.

  • axis (int, optional) – The index of the dimension to perform softmax calculations on. It should be in the range \([-1, rank - 1]\), where \(rank\) is the rank of the input tensor. Default: -1, which means the last dimension.
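To illustrate the axis convention (a NumPy sketch; `sm` is a throwaway helper defined here, not a Paddle API): for a rank-3 tensor, axis=-1 and axis=2 address the same dimension, since negative indices count from the end.

```python
import numpy as np

def sm(x, axis):
    # Numerically stable softmax along `axis`.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

x = np.random.default_rng(0).random((2, 3, 4))
print(np.allclose(sm(x, -1), sm(x, 2)))  # True: axis -1 is the last of 3 dims
```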

Returns

Tensor indicating the output of softmax. The data type and shape are the same as input.

Return type

Tensor

Examples

import paddle
import paddle.nn.functional as F

x = paddle.to_tensor([[[2.0, 3.0, 4.0, 5.0],
                       [3.0, 4.0, 5.0, 6.0],
                       [7.0, 8.0, 8.0, 9.0]],
                      [[1.0, 2.0, 3.0, 4.0],
                       [5.0, 6.0, 7.0, 8.0],
                       [6.0, 7.0, 8.0, 9.0]]], dtype='float32')
y = F.softmax(x, axis=1)
print(y)
# [[[0.00657326, 0.00657326, 0.01714783, 0.01714783],
#   [0.01786798, 0.01786798, 0.04661262, 0.04661262],
#   [0.97555870, 0.97555870, 0.93623954, 0.93623954]],
#  [[0.00490169, 0.00490169, 0.00490169, 0.00490169],
#   [0.26762316, 0.26762316, 0.26762316, 0.26762316],
#   [0.72747517, 0.72747517, 0.72747517, 0.72747517]]]