GLU

class paddle.nn.GLU(axis=-1, name=None) [source]

GLU Activation.

\[GLU(a, b) = a \otimes \sigma(b)\]

where \(a\) is the first half of the input matrices and \(b\) is the second half.
Parameters
  • axis (int, optional) – The axis along which to split the input tensor. It should be in the range [-D, D), where D is the number of dimensions of x. If axis < 0, it works the same way as \(axis + D\). Default is -1.

  • name (str, optional) – Name for the operation (optional, default is None). For more information, please refer to Name.

Shape:
  • input: Tensor whose size along the given axis is even.

  • output: Tensor whose size along the given axis is halved (see the sketch below).
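
For instance, splitting along a non-default axis halves that axis only (a minimal sketch; the shapes are illustrative):

>>> import paddle
>>> x = paddle.rand([2, 4, 6])
>>> m = paddle.nn.GLU(axis=1)  # axis 1 has even size 4, halved to 2
>>> print(m(x).shape)
[2, 2, 6]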

Examples

>>> import paddle
>>> x = paddle.to_tensor(
...     [[-0.22014759, -1.76358426,  0.80566144,  0.04241343],
...      [-1.94900405, -1.89956081,  0.17134808, -1.11280477]]
... )
>>> m = paddle.nn.GLU()
>>> out = m(x)
>>> print(out)
Tensor(shape=[2, 2], dtype=float32, place=Place(cpu), stop_gradient=True,
       [[-0.15216254, -0.90048921],
        [-1.05778778, -0.46985325]])
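
The output matches computing \(a \otimes \sigma(b)\) by hand (a minimal sketch using paddle.split and paddle.nn.functional.sigmoid):

>>> a, b = paddle.split(x, 2, axis=-1)  # even halves of the last axis
>>> manual = a * paddle.nn.functional.sigmoid(b)
>>> print(paddle.allclose(out, manual).item())
True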
forward(x)

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters
  • *inputs (tuple) – Unpacked tuple arguments.

  • **kwargs (dict) – Unpacked dict arguments.
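
In practice the layer instance is called directly, which routes to forward; for GLU this matches the functional form (a minimal sketch, assuming paddle.nn.functional.glu with the same axis semantics):

>>> import paddle
>>> import paddle.nn.functional as F
>>> x = paddle.rand([2, 6])
>>> m = paddle.nn.GLU()
>>> print(paddle.allclose(m(x), F.glu(x, axis=-1)).item())
True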

extra_repr()

Extra representation of this layer. You can provide a custom implementation in your own layer.
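
A minimal sketch of overriding extra_repr in a custom layer (the MyGLU class is illustrative, not part of the API):

>>> import paddle
>>> class MyGLU(paddle.nn.Layer):
...     def __init__(self, axis=-1):
...         super().__init__()
...         self.axis = axis
...     def forward(self, x):
...         return paddle.nn.functional.glu(x, axis=self.axis)
...     def extra_repr(self):
...         # shown between the parentheses when the layer is printed
...         return f'axis={self.axis}'
...
>>> print(MyGLU(axis=1))
MyGLU(axis=1)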