PyLayer

class paddle.autograd.PyLayer [source]
Custom Python operators are built on the PaddlePaddle framework by creating a subclass of PyLayer, which must comply with the following rules:

1. The subclass must contain static forward and backward functions, whose first argument is a PyLayerContext object. If a returned value of backward corresponds to a Tensor that requires gradients in forward, that returned value must be a Tensor.

2. Except for the first argument, the other arguments of backward are the gradients of forward's output Tensors, so the number of input Tensors of backward must equal the number of output Tensors of forward. If you need to use forward's input Tensors in backward, save them by passing them to PyLayerContext's save_for_backward method and retrieve them in backward later.

3. The output of backward can be a Tensor or a list/tuple of Tensors. These are the gradients of forward's input Tensors, so the number of output Tensors of backward must equal the number of input Tensors of forward (see the multi-input sketch after the example below).

After building the custom operator, apply it by calling its apply method.

Examples
>>> import paddle
>>> from paddle.autograd import PyLayer

>>> class cus_tanh(PyLayer):
...     @staticmethod
...     def forward(ctx, x):
...         y = paddle.tanh(x)
...         # Pass tensors to backward.
...         ctx.save_for_backward(y)
...         return y
...
...     @staticmethod
...     def backward(ctx, dy):
...         # Get the tensors passed by forward.
...         y, = ctx.saved_tensor()
...         grad = dy * (1 - paddle.square(y))
...         return grad

>>> paddle.seed(2023)
>>> data = paddle.randn([2, 3], dtype="float64")
>>> data.stop_gradient = False
>>> z = cus_tanh.apply(data)
>>> z.mean().backward()

>>> print(data.grad)
Tensor(shape=[2, 3], dtype=float64, place=Place(cpu), stop_gradient=True,
       [[0.16604150, 0.05858341, 0.14051214],
        [0.15677770, 0.01564609, 0.02991660]])
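As a complement to the example above, the sketch below shows a PyLayer with two inputs and two outputs, illustrating rules 2 and 3: backward accepts one gradient per output of forward and returns one gradient per input of forward. The class name MulAdd and its two-input signature are illustrative only, not part of the Paddle API.

>>> import paddle
>>> from paddle.autograd import PyLayer

>>> class MulAdd(PyLayer):
...     @staticmethod
...     def forward(ctx, x, y):
...         # Two Tensor inputs, two Tensor outputs.
...         ctx.save_for_backward(x, y)
...         return x * y, x + y
...
...     @staticmethod
...     def backward(ctx, grad_prod, grad_sum):
...         # One incoming gradient per forward output.
...         x, y = ctx.saved_tensor()
...         # One returned gradient per forward input.
...         grad_x = grad_prod * y + grad_sum
...         grad_y = grad_prod * x + grad_sum
...         return grad_x, grad_y

>>> x = paddle.randn([3])
>>> y = paddle.randn([3])
>>> x.stop_gradient = False
>>> y.stop_gradient = False
>>> prod, total = MulAdd.apply(x, y)
>>> (prod.sum() + total.sum()).backward()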
forward

static forward(ctx, *args, **kwargs)
It is to be overloaded by subclasses. It must accept an object of PyLayerContext as the first argument, followed by any number of arguments (tensors or other types). None cannot be included in the returned result.

Parameters

- *args (tuple) – input of PyLayer.
- **kwargs (dict) – input of PyLayer.

Returns

output of PyLayer.

Return type

tensors or other types
Examples
>>> import paddle
>>> from paddle.autograd import PyLayer

>>> class cus_tanh(PyLayer):
...     @staticmethod
...     def forward(ctx, x):
...         y = paddle.tanh(x)
...         # Pass tensors to backward.
...         ctx.save_for_backward(y)
...         return y
...
...     @staticmethod
...     def backward(ctx, dy):
...         # Get the tensors passed by forward.
...         y, = ctx.saved_tensor()
...         grad = dy * (1 - paddle.square(y))
...         return grad
backward

static backward(ctx, *args) [source]
This is the function that calculates the gradients. It is to be overloaded by subclasses. It must accept an object of PyLayerContext as the first argument; the remaining arguments are the gradients of forward's output tensors. The output tensors of backward are the gradients of forward's input tensors.

Parameters

- *args (tuple) – The gradient of forward's output tensor(s).

Returns

The gradient of forward's input tensor(s).

Return type

Tensor or list of Tensors
Examples
>>> import paddle
>>> from paddle.autograd import PyLayer

>>> class cus_tanh(PyLayer):
...     @staticmethod
...     def forward(ctx, x):
...         y = paddle.tanh(x)
...         # Pass tensors to backward.
...         ctx.save_for_backward(y)
...         return y
...
...     @staticmethod
...     def backward(ctx, dy):
...         # Get the tensors passed by forward.
...         y, = ctx.saved_tensor()
...         grad = dy * (1 - paddle.square(y))
...         return grad
mark_non_differentiable

mark_non_differentiable(*args)
Marks outputs as non-differentiable. This should be called at most once, only from inside the forward method, and all arguments should be tensor outputs.

This marks the outputs as not requiring gradients, which increases the efficiency of the backward computation. You still need to accept a gradient for each output in backward, but it will always be a zero tensor with the same shape as the corresponding output.
Examples
>>> import paddle
>>> from paddle.autograd import PyLayer
>>> import numpy as np

>>> class Tanh(PyLayer):
...     @staticmethod
...     def forward(ctx, x):
...         a = x + x
...         b = x + x + x
...         ctx.mark_non_differentiable(a)
...         return a, b
...
...     @staticmethod
...     def backward(ctx, grad_a, grad_b):
...         assert np.equal(grad_a.numpy(), paddle.zeros([1]).numpy())
...         assert np.equal(grad_b.numpy(), paddle.ones([1], dtype="float64").numpy())
...         return grad_b

>>> x = paddle.ones([1], dtype="float64")
>>> x.stop_gradient = False
>>> a, b = Tanh.apply(x)
>>> b.sum().backward()
mark_not_inplace

mark_not_inplace(*args)
Marks inputs as not inplace. This should be called at most once, only from inside the forward method, and all arguments should be Tensor inputs.

If a Tensor returned by the forward method is the same Tensor as one of forward's inputs and it is marked as not inplace, Paddle creates a new Tensor as the output, thereby preventing the autograd information of the input Tensor from being overwritten.
Examples
>>> import paddle

>>> class Exp(paddle.autograd.PyLayer):
...     @staticmethod
...     def forward(ctx, x):
...         ctx.mark_not_inplace(x)
...         return x
...
...     @staticmethod
...     def backward(ctx, grad_output):
...         out = grad_output.exp()
...         return out

>>> paddle.seed(2023)
>>> x = paddle.randn((1, 1))
>>> x.stop_gradient = False
>>> attn_layers = []
>>> for idx in range(0, 2):
...     attn_layers.append(Exp())

>>> for step in range(0, 2):
...     a = x
...     for j in range(0, 2):
...         a = attn_layers[j].apply(x)
...     a.backward()
save_for_backward

save_for_backward(*tensors)
Saves the given tensors that backward needs. Use saved_tensor in backward to get the saved tensors.

Note

This API should be called at most once, and only inside forward.

Parameters

- tensors (list of Tensors) – Tensors to be stored.

Returns

None
Examples
>>> import paddle
>>> from paddle.autograd import PyLayer

>>> class cus_tanh(PyLayer):
...     @staticmethod
...     def forward(ctx, x):
...         # ctx is a context object that stores some objects for backward.
...         y = paddle.tanh(x)
...         # Pass tensors to backward.
...         ctx.save_for_backward(y)
...         return y
...
...     @staticmethod
...     def backward(ctx, dy):
...         # Get the tensors passed by forward.
...         y, = ctx.saved_tensor()
...         grad = dy * (1 - paddle.square(y))
...         return grad
saved_tensor

saved_tensor()
Gets the tensors stored by save_for_backward.

Returns

If the context contains tensors stored by save_for_backward, these tensors are returned; otherwise, None is returned.

Return type

list of Tensors or None
Examples
>>> import paddle
>>> from paddle.autograd import PyLayer

>>> class cus_tanh(PyLayer):
...     @staticmethod
...     def forward(ctx, x):
...         # ctx is a context object that stores some objects for backward.
...         y = paddle.tanh(x)
...         # Pass tensors to backward.
...         ctx.save_for_backward(y)
...         return y
...
...     @staticmethod
...     def backward(ctx, dy):
...         # Get the tensors passed by forward.
...         y, = ctx.saved_tensor()
...         grad = dy * (1 - paddle.square(y))
...         return grad
set_materialize_grads

set_materialize_grads(value: bool)
Sets whether to materialize output grad tensors. The default is True.
This should be called only from inside the forward method.
If True, undefined output grad tensors will be expanded to tensors full of zeros prior to calling the backward method.
If False, undefined output grad tensors will be None.
Examples
>>> import paddle
>>> from paddle.autograd import PyLayer
>>> import numpy as np

>>> class Tanh(PyLayer):
...     @staticmethod
...     def forward(ctx, x):
...         return x + x + x, x + x
...
...     @staticmethod
...     def backward(ctx, grad, grad2):
...         assert np.equal(grad2.numpy(), paddle.zeros([1]).numpy())
...         return grad

>>> class Tanh2(PyLayer):
...     @staticmethod
...     def forward(ctx, x):
...         ctx.set_materialize_grads(False)
...         return x + x + x, x + x
...
...     @staticmethod
...     def backward(ctx, grad, grad2):
...         assert grad2 is None
...         return grad

>>> x = paddle.ones([1], dtype="float64")
>>> x.stop_gradient = False
>>> Tanh.apply(x)[0].backward()

>>> x2 = paddle.ones([1], dtype="float64")
>>> x2.stop_gradient = False
>>> Tanh2.apply(x2)[0].backward()