softmax_mask_fuse¶

paddle.incubate.softmax_mask_fuse(x, mask, name=None) ¶

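This op adds the input mask to the input x and applies softmax to the result. It is designed mainly to accelerate Transformer architectures by fusing the two operations tmp = x + mask and rst = softmax(tmp) into a single operation. The formula is: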

\[out = softmax(x + mask)\]
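For readers without a GPU, the formula can be sketched as an unfused NumPy reference. This is only an illustration under stated assumptions, not the library's implementation: it assumes the softmax is taken over the last axis, and the function name `softmax_mask_reference` and the additive causal mask (0 where attention is allowed, -1e4 elsewhere) are hypothetical choices made for the example.

```python
import numpy as np

def softmax_mask_reference(x, mask):
    """Unfused NumPy reference for out = softmax(x + mask).

    Assumption: softmax is computed over the last axis.
    """
    tmp = x + mask  # a mask whose second dimension is 1 broadcasts over x's second dimension
    tmp = tmp - tmp.max(axis=-1, keepdims=True)  # subtract the row max for numerical stability
    exp = np.exp(tmp)
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.random((2, 8, 32, 32), dtype=np.float32)

# Hypothetical additive causal mask: 0 on and below the diagonal, -1e4 above it
allowed = np.tril(np.ones((32, 32), dtype=bool))
mask = np.where(allowed, 0.0, -1e4).astype(np.float32)
mask = np.broadcast_to(mask, (2, 1, 32, 32))

out = softmax_mask_reference(x, mask)
print(out.shape)  # (2, 8, 32, 32)
```

With this mask, each row of `out` still sums to 1, and positions above the diagonal receive a weight of effectively zero, which is the usual purpose of adding a mask before softmax in attention layers.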

Note

This API can only run on GPU.

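Parameters¶

x (4-D Tensor) - The input Tensor. It must have a 4-D shape, with data type float16 or float32. The fourth dimension of x must be greater than or equal to 32 and less than 8192.

mask (4-D Tensor) - The input Tensor. It must have a 4-D shape, with data type float16 or float32. The second dimension of mask must be 1, and its other dimensions must match the corresponding dimensions of x.

name (str, optional) - For details, see Name. It generally does not need to be set; the default is None.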
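The shape constraints on x and mask can be checked before calling the op. The helper below is a hypothetical sketch written for this page (`check_softmax_mask_fuse_shapes` is not part of Paddle); it only encodes the constraints stated above and does not check data types or device placement.

```python
def check_softmax_mask_fuse_shapes(x_shape, mask_shape):
    """Hypothetical helper: validate shapes against the documented constraints."""
    if len(x_shape) != 4 or len(mask_shape) != 4:
        raise ValueError("x and mask must both be 4-D")
    if not 32 <= x_shape[3] < 8192:
        raise ValueError("the fourth dimension of x must be in [32, 8192)")
    if mask_shape[1] != 1:
        raise ValueError("the second dimension of mask must be 1")
    for axis in (0, 2, 3):
        if mask_shape[axis] != x_shape[axis]:
            raise ValueError(f"dimension {axis} of mask must match x")

# Shapes used in the code example on this page: passes silently
check_softmax_mask_fuse_shapes([2, 8, 8, 32], [2, 1, 8, 32])
```

A call such as `check_softmax_mask_fuse_shapes([2, 8, 8, 16], [2, 1, 8, 16])` raises `ValueError`, because the fourth dimension of x is below 32.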

Returns¶

Tensor with the same shape and data type as x, holding the result of the operation.

Code Examples¶

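# required: gpu
import paddle
import paddle.incubate as incubate

# x must be 4-D, with its fourth dimension in [32, 8192)
x = paddle.rand([2, 8, 8, 32])
# mask matches x in every dimension except the second, which must be 1
mask = paddle.rand([2, 1, 8, 32])

# Fused equivalent of softmax(x + mask)
rst = incubate.softmax_mask_fuse(x, mask)
# Sample output (values vary between runs because the inputs are random):
# [[[[0.02404429, 0.04658398, 0.02746007, ..., 0.01489375, 0.02397441, 0.02851614] ... ]]]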