FusedFeedForward

class paddle.incubate.nn.FusedFeedForward(d_model, dim_feedforward, dropout_rate=0.1, epsilon=1e-05, activation='relu', act_dropout_rate=None, normalize_before=False, weight_attr=None, bias_attr=None, name=None) [source]

A fused operator that computes the feed forward (FFN) sublayer of the Transformer architecture.
Parameters
  • d_model (int) – The expected feature size in the input and output.

  • dim_feedforward (int) – The hidden layer size.

  • dropout_rate (float, optional) – The dropout probability used in pre-process and post-process. Default: 0.1.

  • epsilon (float, optional) – The small value added to the variance to prevent division by zero. Default: 1e-05.

  • activation (str, optional) – The activation function. Default: relu.

  • act_dropout_rate (float, optional) – The dropout probability after the activation. If None, the value of dropout_rate is used. Default: None.

  • normalize_before (bool, optional) – Indicates whether to apply layer normalization in pre-processing (before the feed forward sublayer) or in post-processing (after it). Default: False.

  • weight_attr (ParamAttr, optional) – The attribute for the learnable weight of this layer. The default value is None and the weight will be initialized to zero. For detailed information, please refer to paddle.ParamAttr.

  • bias_attr (ParamAttr|bool, optional) – The attribute for the learnable bias of this layer. If it is set to False, no bias will be added to the output. If it is set to None or one kind of ParamAttr, a bias parameter will be created according to ParamAttr. For detailed information, please refer to paddle.ParamAttr. The default value is None and the bias will be initialized to zero.

Examples

# required: gpu
import paddle
from paddle.incubate.nn import FusedFeedForward

# d_model = 8, dim_feedforward = 8
fused_feedforward_layer = FusedFeedForward(8, 8)
x = paddle.rand((1, 8, 8))  # [batch_size, sequence_length, d_model]
out = fused_feedforward_layer(x)
print(out.numpy().shape)
# (1, 8, 8)
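
As a further, hedged illustration of the constructor arguments (not part of the official example; the sizes, activation and initializer choices below are arbitrary assumptions), the layer can also be configured with pre-layer-normalization, custom dropout rates and explicit parameter attributes:

# required: gpu
import paddle
from paddle.incubate.nn import FusedFeedForward

# Illustrative parameter attributes; the initializers are assumptions, not defaults.
weight_attr = paddle.ParamAttr(initializer=paddle.nn.initializer.XavierUniform())
bias_attr = paddle.ParamAttr(initializer=paddle.nn.initializer.Constant(0.0))

layer = FusedFeedForward(
    d_model=128,
    dim_feedforward=512,
    dropout_rate=0.2,
    activation='gelu',
    normalize_before=True,   # apply LayerNorm before the feed forward sublayer
    weight_attr=weight_attr,
    bias_attr=bias_attr,
)

x = paddle.rand((2, 16, 128))  # [batch_size, sequence_length, d_model]
out = layer(x)
print(out.shape)  # [2, 16, 128]
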
forward(src, cache=None)

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters
  • *inputs (tuple) – unpacked tuple arguments

  • **kwargs (dict) – unpacked dict arguments
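
For orientation, below is a minimal, unfused sketch of the computation applied to src, following the standard Transformer feed forward sublayer. reference_feedforward is a hypothetical helper used only to illustrate how normalize_before, dropout_rate and act_dropout_rate interact; it is not the fused kernel itself.

import paddle
import paddle.nn.functional as F

def reference_feedforward(src, linear1, linear2, layer_norm,
                          dropout_rate=0.1, act_dropout_rate=None,
                          normalize_before=False):
    # Hypothetical unfused reference; the fused layer performs equivalent math in fused kernels.
    if act_dropout_rate is None:
        act_dropout_rate = dropout_rate
    residual = src
    if normalize_before:
        src = layer_norm(src)                                      # pre-LayerNorm
    src = F.dropout(F.relu(linear1(src)), p=act_dropout_rate)      # hidden projection + activation
    src = F.dropout(linear2(src), p=dropout_rate)                  # output projection
    out = residual + src                                           # residual connection
    if not normalize_before:
        out = layer_norm(out)                                      # post-LayerNorm
    return out

# Tiny usage with illustrative sizes: d_model = 8, dim_feedforward = 16
linear1 = paddle.nn.Linear(8, 16)
linear2 = paddle.nn.Linear(16, 8)
layer_norm = paddle.nn.LayerNorm(8)
x = paddle.rand((1, 4, 8))
print(reference_feedforward(x, linear1, linear2, layer_norm).shape)  # [1, 4, 8]
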

extra_repr()

Extra representation of this layer; you can customize it when implementing your own layer.
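
As a small, hedged usage note: the string returned by extra_repr is included in the layer's printed representation, so printing the layer (or calling the method directly) shows its configuration; the exact text is not specified here.

# required: gpu
import paddle
from paddle.incubate.nn import FusedFeedForward

layer = FusedFeedForward(8, 8)
print(layer.extra_repr())  # summary string of this layer's configuration
print(layer)               # the layer's repr embeds the extra_repr() output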