softmax_mask_fuse_upper_triangle
- paddle.incubate.softmax_mask_fuse_upper_triangle(x)
-
Performs a masked softmax on x, always masking the upper triangular part of x (the entries above the main diagonal).
This API is designed to speed up GPT-style Transformer structures. It fuses the two operations tmp = x + mask and out = softmax(tmp) into one, where the mask is always an upper triangular matrix. The equation is:
\[out = softmax(LowerTriangular(x))\]
Note
This API only supports GPU.
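For readers without a GPU, the equation can be illustrated with an unfused NumPy reference. This is only a sketch, not Paddle's implementation: the helper name `masked_softmax_upper_triangle_reference` is hypothetical, and it assumes that the masked entries are those strictly above the main diagonal and that the softmax runs over the last axis, which is consistent with the sample output under Examples.

```python
import numpy as np


def masked_softmax_upper_triangle_reference(x):
    """Unfused NumPy reference (hypothetical helper, not part of Paddle)."""
    x = np.asarray(x, dtype=np.float32)
    seq_len = x.shape[-1]
    # True strictly above the main diagonal: these positions are masked out.
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    # Masked positions become -inf, so they contribute exp(-inf) = 0.
    tmp = np.where(mask, -np.inf, x)
    # Numerically stable softmax over the last axis.
    tmp = tmp - tmp.max(axis=-1, keepdims=True)
    e = np.exp(tmp)
    return e / e.sum(axis=-1, keepdims=True)


x = np.random.default_rng(0).random((1, 1, 32, 32), dtype=np.float32)
out = masked_softmax_upper_triangle_reference(x)
```

Each row of `out` sums to 1, and row i has nonzero entries only in columns 0 through i. The random values differ from Paddle's generator, so the numbers will not match the GPU example.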
- Parameters
-
x (4-D Tensor) – The input tensor. It must be 4-D, and its data type must be float16 or float32. The fourth dimension of x must be greater than or equal to 32 and less than 8192. The third dimension of x must be equal to the fourth dimension of x.
- Returns
-
4-D Tensor. The tensor in which the result is stored. It has the same shape as x.
Examples
>>> import paddle
>>> import paddle.incubate as incubate
>>> paddle.seed(1)
>>> paddle.set_device("gpu")
>>> x = paddle.rand((1, 1, 32, 32))
>>> rst = incubate.softmax_mask_fuse_upper_triangle(x)
>>> print(rst)
Tensor(shape=[1, 1, 32, 32], dtype=float32, place=Place(gpu:0), stop_gradient=True,
[[[[1.        , 0.        , 0.        , ..., 0.        , 0.        , 0.        ],
   [0.49575609, 0.50424391, 0.        , ..., 0.        , 0.        , 0.        ],
   [0.26035303, 0.25114325, 0.48850375, ..., 0.        , 0.        , 0.        ],
   ...,
   [0.04379999, 0.04194880, 0.05150032, ..., 0.02721255, 0.        , 0.        ],
   [0.02348574, 0.01959674, 0.02609110, ..., 0.04046615, 0.02248267, 0.        ],
   [0.02280738, 0.03144657, 0.02892209, ..., 0.03885521, 0.03342311, 0.02842640]]]])
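The constraints on x listed under Parameters can be checked before calling the API. The following is a minimal sketch in plain Python. The helper `check_fused_softmax_input` is hypothetical and not part of Paddle, and it assumes the shape is passed as a sequence of ints and the dtype as a string such as "float32".

```python
def check_fused_softmax_input(shape, dtype):
    """Hypothetical pre-check mirroring the documented constraints on x."""
    if len(shape) != 4:
        raise ValueError(f"x must be 4-D, got {len(shape)}-D")
    if dtype not in ("float16", "float32"):
        raise ValueError(f"x must be float16 or float32, got {dtype}")
    if shape[2] != shape[3]:
        raise ValueError("third and fourth dimensions of x must be equal")
    if not (32 <= shape[3] < 8192):
        raise ValueError("fourth dimension of x must be >= 32 and < 8192")
    return True


ok = check_fused_softmax_input((1, 1, 32, 32), "float32")
```

With a Paddle tensor, the shape would typically come from `x.shape`. How the dtype is converted to a string depends on the Paddle version, so that step is left to the caller.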