FusedBiasDropoutResidualLayerNorm

class paddle.incubate.nn.FusedBiasDropoutResidualLayerNorm(embed_dim: int, dropout_rate: float = 0.5, weight_attr: ParamAttrLike | None = None, bias_attr: ParamAttrLike | None = None, epsilon: float = 1e-05, name: str | None = None)

Applies the fused_bias_dropout_residual_layer_norm operation, which computes layer_norm(residual + dropout(x + bias)) as a single fused operation.

Parameters
  • embed_dim (int) – The expected feature size in the input and output.

  • dropout_rate (float, optional) – The probability of dropping an element in the dropout applied to the biased input, before the residual is added. 0 for no dropout. Default 0.5.

  • weight_attr (ParamAttr|None, optional) – The attribute for the learnable weight of this layer. The default value is None and the weight will be initialized to zero. For detailed information, please refer to paddle.ParamAttr.

  • bias_attr (ParamAttr|bool|None, optional) – The attribute for the bias parameter. Default: None, which means the default bias parameter property is used. If it is set to False, this layer will not have a trainable bias parameter. For details, see paddle.ParamAttr.

  • epsilon (float, optional) – The small value added to the variance to prevent division by zero. Default: 1e-05.

  • name (str|None, optional) – Normally there is no need for the user to set this parameter. For detailed information, please refer to Name.
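The computation that this layer fuses can be sketched in plain NumPy. This is an unfused reference under stated assumptions, not Paddle's implementation: the function name `bias_dropout_residual_layer_norm` is hypothetical, the dropout is assumed to use "upscale in train" scaling, and the parameter values (`bias`, `ln_scale`, `ln_bias`) are illustrative rather than the layer's actual initial values.

```python
import numpy as np


def bias_dropout_residual_layer_norm(x, residual, bias, ln_scale, ln_bias,
                                     dropout_rate=0.5, epsilon=1e-5,
                                     training=True, rng=None):
    """Unfused NumPy reference: layer_norm(residual + dropout(x + bias))."""
    # bias has shape [embed_dim] and broadcasts over batch and sequence.
    h = x + bias
    if training and dropout_rate > 0.0:
        # Assumption: "upscale in train" dropout, where kept elements are
        # scaled by 1 / (1 - dropout_rate) so the expected value is unchanged.
        rng = np.random.default_rng() if rng is None else rng
        keep = rng.random(h.shape) >= dropout_rate
        scale = 0.0 if dropout_rate >= 1.0 else 1.0 / (1.0 - dropout_rate)
        h = h * keep * scale
    h = residual + h
    # Layer normalization over the last (embed_dim) axis; epsilon keeps the
    # denominator away from zero when the variance is zero.
    mean = h.mean(axis=-1, keepdims=True)
    var = h.var(axis=-1, keepdims=True)
    return (h - mean) / np.sqrt(var + epsilon) * ln_scale + ln_bias


# Illustrative inputs with shape [batch_size, seq_len, embed_dim].
rng = np.random.default_rng(0)
x = rng.random((2, 4, 128))
residual = rng.random((2, 4, 128))
bias = np.zeros(128)      # illustrative value
ln_scale = np.ones(128)   # illustrative value
ln_bias = np.zeros(128)   # illustrative value

out = bias_dropout_residual_layer_norm(
    x, residual, bias, ln_scale, ln_bias, dropout_rate=0.0)
print(out.shape)  # (2, 4, 128)
```

With `dropout_rate=0.0` the result is deterministic, which makes a reference like this convenient for comparing against the fused layer's output in evaluation mode.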

Examples

>>> import paddle
>>> paddle.device.set_device('gpu')
>>> # input: [batch_size, seq_len, embed_dim]
>>> x = paddle.rand((2, 4, 128))
>>> # residual: [batch_size, seq_len, embed_dim]
>>> residual = paddle.rand((2, 4, 128))
>>> fused_bias_dropout_residual_ln = paddle.incubate.nn.FusedBiasDropoutResidualLayerNorm(128)
>>> output = fused_bias_dropout_residual_ln(x, residual)
>>> print(output.shape)
[2, 4, 128]

forward(x: Tensor, residual: Tensor) → Tensor

Applies the fused_bias_dropout_residual_layer_norm operation to x and residual.

Parameters
  • x (Tensor) – The input tensor. It is a tensor with shape [batch_size, seq_len, embed_dim]. The data type should be float32 or float64.

  • residual (Tensor) – The residual tensor. It is a tensor with shape [batch_size, seq_len, embed_dim], the same shape as x. The data type should be float32 or float64.

Returns

A tensor with the same shape and data type as x.

Return type

Tensor

extra_repr()

Returns the extra representation of this layer. Override it to provide a custom representation for your own layer.