fused_layer_norm

paddle.incubate.nn.functional.fused_layer_norm(x: Tensor, norm_weight: Tensor, norm_bias: Tensor, epsilon: float, residual_alpha: float = 1.0, begin_norm_axis: int = 1, bias: Tensor | None = None, residual: None = None, quant_scale: float = -1, quant_round_type: float = 0, quant_max_bound: float = 0, quant_min_bound: float = 0) → Tensor
paddle.incubate.nn.functional.fused_layer_norm(x: Tensor, norm_weight: Tensor, norm_bias: Tensor, epsilon: float, residual_alpha: float = 1.0, begin_norm_axis: int = 1, bias: Tensor | None = None, residual: Tensor = None, quant_scale: float = -1, quant_round_type: float = 0, quant_max_bound: float = 0, quant_min_bound: float = 0) → tuple[Tensor, Tensor]

Applies the fused LayerNorm kernel. Also supports the fused pattern LayerNorm(bias + residual_alpha * residual + x).

When norm_weight and norm_bias are None, it returns the fused result of (bias + residual_alpha * residual + x) without applying normalization.
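
For reference, the fused pattern can be written with plain Paddle ops as below. This is a minimal unfused sketch (assuming normalization over the last axis); the fused kernel computes the same result in a single pass:

>>> import paddle
>>> import paddle.nn.functional as F
>>> x = paddle.randn([4, 8])
>>> residual = paddle.randn([4, 8])
>>> bias = paddle.randn([8])            # previous layer's bias
>>> norm_weight = paddle.ones([8])
>>> norm_bias = paddle.zeros([8])
>>> residual_alpha = 1.0
>>> # add the bias and the scaled residual, then apply LayerNorm
>>> fused_input = bias + residual_alpha * residual + x
>>> ref_out = F.layer_norm(fused_input, normalized_shape=[8],
...                        weight=norm_weight, bias=norm_bias, epsilon=1e-6)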

Parameters
  • x (Tensor) – the input Tensor.

  • norm_weight (Tensor) – the weight Tensor used to affine the normalized output.

  • norm_bias (Tensor) – the bias Tensor used to affine the normalized output.

  • epsilon (float) – a small float added to the variance to avoid division by zero.

  • residual_alpha (float) – a scaling factor applied to the residual input. Default is 1.0.

  • begin_norm_axis (int) – the first axis along which to normalize. Default is 1.

  • bias (Tensor, optional) – the previous layer's bias to be fused into the computation.

  • residual (Tensor, optional) – the residual input to be fused into the computation.

  • quant_scale (float) – the quantization scale. Default is -1.

  • quant_round_type (float) – the rounding type used for quantization.

  • quant_max_bound (float) – the maximum bound to clip the quantized output.

  • quant_min_bound (float) – the minimum bound to clip the quantized output.

Returns

the output Tensor. When residual is provided, a tuple of two Tensors (the output and the fused residual) is returned instead, as in the second signature above.

Return type

Tensor | tuple[Tensor, Tensor]

Examples

>>> import paddle
>>> paddle.device.set_device('gpu')

>>> paddle_x = paddle.cast(paddle.randn(shape=[32, 256]), dtype=paddle.float16)
>>> paddle_weight = paddle.cast(paddle.randn(shape=[256]), dtype=paddle.float32)
>>> paddle_bias = paddle.cast(paddle.randn(shape=[256]), dtype=paddle.float32)
>>> epsilon = 1e-6
>>> paddle_layernorm = paddle.incubate.nn.functional.fused_layer_norm(paddle_x, paddle_weight, paddle_bias, epsilon, begin_norm_axis=1)
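
The residual-fused variant returns a tuple, per the second signature above. The call below is a sketch of that usage (argument names follow the signature; outputs are not shown):

>>> paddle_residual = paddle.cast(paddle.randn(shape=[32, 256]), dtype=paddle.float16)
>>> paddle_out, paddle_residual_out = paddle.incubate.nn.functional.fused_layer_norm(
...     paddle_x, paddle_weight, paddle_bias, epsilon,
...     residual_alpha=1.0, begin_norm_axis=1, residual=paddle_residual)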