GLU

class paddle.nn.GLU(axis: int = -1, name: str | None = None)

GLU Activation.

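\[GLU(a, b) = a \otimes \sigma(b)\]

where \(a\) is the first half of the input tensor and \(b\) is the second half, both taken along axis. Here \(\sigma\) is the sigmoid function and \(\otimes\) is element-wise multiplication.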
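The formula can be checked by hand with a small NumPy sketch. This is an illustration only, not Paddle's implementation: the helper name `glu_reference` is hypothetical, and NumPy's float64 is used here instead of Paddle's default float32.

```python
import numpy as np


def glu_reference(x, axis=-1):
    """Hypothetical NumPy reference for GLU(a, b) = a * sigmoid(b)."""
    # Split the input into two equal halves along `axis`.
    a, b = np.split(x, 2, axis=axis)
    # Gate the first half with the sigmoid of the second half.
    return a * (1.0 / (1.0 + np.exp(-b)))


# Same input values as in the Examples section of this page.
x = np.array(
    [[-0.22014759, -1.76358426, 0.80566144, 0.04241343],
     [-1.94900405, -1.89956081, 0.17134808, -1.11280477]]
)
out = glu_reference(x)
print(out)
```

Under these assumptions, the printed values should agree with the tensor shown in the Examples section to roughly five decimal places.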

Parameters :

axis (int, optional) – The axis along which to split the input tensor. It should be in the range [-D, D), where D is the number of dimensions of x. If axis < 0, it works the same way as \(axis + D\). Default is -1.
name (str|None, optional) – Name for the operation (optional, default is None). For more information, please refer to Name.

Shape:

input: Tensor whose size along the given axis is even.
output: Tensor whose size along the given axis is half that of the input; all other dimensions are unchanged.
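The shape rule and the negative-axis convention can be illustrated with a NumPy sketch. The helper `glu_reference` is a hypothetical stand-in for the layer, not Paddle code, and the way it rejects odd sizes (a `ValueError` from `np.split`) is a property of this sketch; how Paddle reports the same problem may differ.

```python
import numpy as np


def glu_reference(x, axis=-1):
    """Hypothetical NumPy stand-in for GLU, used only to show shapes."""
    # np.split raises ValueError when the axis cannot be divided evenly.
    a, b = np.split(x, 2, axis=axis)
    return a / (1.0 + np.exp(-b))  # a * sigmoid(b)


x = np.arange(24, dtype=np.float64).reshape(4, 6)  # D = 2 dimensions

out_last = glu_reference(x, axis=-1)   # halves the size-6 axis
out_first = glu_reference(x, axis=0)   # halves the size-4 axis
print(out_last.shape)   # (4, 3)
print(out_first.shape)  # (2, 6)

# A negative axis behaves like axis + D: here -2 + 2 == 0.
same = np.array_equal(glu_reference(x, axis=-2), glu_reference(x, axis=0))
print(same)  # True

# An odd size along the split axis cannot be halved.
try:
    glu_reference(np.zeros((2, 3)), axis=-1)
    odd_rejected = False
except ValueError:
    odd_rejected = True
print(odd_rejected)  # True
```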

Examples

>>> import paddle
>>> x = paddle.to_tensor(
...     [[-0.22014759, -1.76358426,  0.80566144,  0.04241343],
...         [-1.94900405, -1.89956081,  0.17134808, -1.11280477]]
... )
>>> m = paddle.nn.GLU()
>>> out = m(x)
>>> print(out)
Tensor(shape=[2, 2], dtype=float32, place=Place(cpu), stop_gradient=True,
[[-0.15216254, -0.90048921],
[-1.05778778, -0.46985325]])

forward(x: Tensor) → Tensor

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters :

x (Tensor) – The input tensor, whose size along axis must be even.

extra_repr() → str

Extra representation of this layer. You can provide a custom implementation in your own layer.
