hessian

paddle.autograd.hessian(ys: Tensor, xs: Tensor, batch_axis: int | None = None) → Hessian

paddle.autograd.hessian(ys: Tensor, xs: Sequence[Tensor], batch_axis: int | None = None) → tuple[tuple[paddle.autograd.autograd.Hessian, ...], ...]

Computes the Hessian of the dependent variable ys with respect to the independent variable xs.

Here, ys is the output obtained from xs by some operation and must be a single Tensor; xs can be a Tensor or a tuple of Tensors; batch_axis is the position of the batch dimension in the data.

When xs is a tuple of Tensors, the result is a nested tuple of Hessian objects. For example, if xs consists of two Tensors with shapes ([M1, ], [M2, ]), the result consists of blocks with shapes (([M1, M1], [M1, M2]), ([M2, M1], [M2, M2])).
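The block layout can be illustrated without Paddle. The following is a minimal pure-Python sketch, not Paddle's implementation: it approximates second derivatives with central finite differences, and the names `full_hessian`, `hessian_blocks`, and `f` are hypothetical helpers introduced only for this illustration.

```python
def full_hessian(f, x, h=1e-2):
    """Approximate the Hessian of scalar f at point x (a list of floats)."""
    n = len(x)

    def shifted(i, si, j, sj):
        # Evaluate f at x + si*h*e_i + sj*h*e_j.
        z = list(x)
        z[i] += si * h
        z[j] += sj * h
        return f(z)

    return [
        [
            (shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
             - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)) / (4 * h * h)
            for j in range(n)
        ]
        for i in range(n)
    ]


def hessian_blocks(f, xs):
    """Return blocks[a][b], the second derivative of f(*xs) w.r.t. xs[a], xs[b]."""
    n_inputs = len(xs)
    offsets = [0]
    for x in xs:
        offsets.append(offsets[-1] + len(x))
    flat = [v for x in xs for v in x]

    def f_flat(z):
        # Split the flat vector back into the original inputs.
        return f(*[z[offsets[k]:offsets[k + 1]] for k in range(n_inputs)])

    H = full_hessian(f_flat, flat)
    return [
        [
            [row[offsets[b]:offsets[b + 1]] for row in H[offsets[a]:offsets[a + 1]]]
            for b in range(n_inputs)
        ]
        for a in range(n_inputs)
    ]


def f(x1, x2):
    # Hypothetical test function: sum(x1**2) + sum(x1) * sum(x2).
    return sum(v * v for v in x1) + sum(x1) * sum(x2)


x1 = [0.5, -1.0, 2.0]        # M1 = 3
x2 = [1.0, 0.0, -0.5, 3.0]   # M2 = 4
blocks = hessian_blocks(f, (x1, x2))
shapes = [[(len(blk), len(blk[0])) for blk in row] for row in blocks]
print(shapes)  # [[(3, 3), (3, 4)], [(4, 3), (4, 4)]]
```

For this quadratic `f`, `blocks[0][0]` is approximately 2 times the identity, both off-diagonal blocks are all ones, and `blocks[1][1]` is all zeros, which matches the (([M1, M1], [M1, M2]), ([M2, M1], [M2, M2])) layout described above.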

When batch_axis=None, only 0-dimensional or 1-dimensional Tensors are supported. If xs has shape [N, ] and ys has shape [ ] (a 0-dimensional Tensor), the result is a single Hessian matrix of shape [N, N].
When batch_axis=0, only 1-dimensional or 2-dimensional Tensors are supported. If xs has shape [B, N] and ys has shape [B, ], the resulting Hessian has shape [B, N, N].
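To make the batched shape concrete, here is a small pure-Python sketch, not Paddle code. It assumes that each ys[b] depends only on the matching row xs[b], so the batched result is one [N, N] Hessian per sample stacked along the batch axis. The helper `hessian_fd` and the function `f` are hypothetical names used only here.

```python
def hessian_fd(f, x, h=1e-2):
    """Central finite-difference Hessian of scalar f at x (a list of floats)."""
    n = len(x)

    def shifted(i, si, j, sj):
        # Evaluate f at x + si*h*e_i + sj*h*e_j.
        z = list(x)
        z[i] += si * h
        z[j] += sj * h
        return f(z)

    return [
        [
            (shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
             - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)) / (4 * h * h)
            for j in range(n)
        ]
        for i in range(n)
    ]


def f(x):
    # Hypothetical per-sample function: sum(x**3); its exact Hessian is diag(6 * x).
    return sum(v ** 3 for v in x)


xs = [[1.0, 2.0, 3.0], [0.5, -1.0, 0.0]]   # shape [B, N] = [2, 3]
ys = [f(row) for row in xs]                # shape [B]
H = [hessian_fd(f, row) for row in xs]     # shape [B, N, N]

shape = (len(H), len(H[0]), len(H[0][0]))
print(shape)  # (2, 3, 3)
```

Each `H[b]` here approximates diag(6 * xs[b]), so the two samples produce different [N, N] matrices inside a single [B, N, N] result.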

Creating the Hessian object does not trigger the full computation; evaluation is partially lazy. The object supports multi-dimensional indexing to obtain the entire Hessian matrix or a sub-matrix, and only at that point is the computation actually performed and the result returned. Sub-matrices computed during evaluation are cached, so later indexing does not recompute them.
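The idea of lazy evaluation with caching can be sketched in plain Python. This is an illustration of the general pattern only: the class `LazyRowMatrix`, its `evaluations` counter, and the choice to cache at row granularity are assumptions made for this example and do not describe Paddle's internal Hessian class.

```python
class LazyRowMatrix:
    """Matrix whose rows are computed on first access and then cached."""

    def __init__(self, row_fn, n_rows):
        self._row_fn = row_fn    # computes one row on demand
        self._n_rows = n_rows
        self._cache = {}         # row index -> computed row
        self.evaluations = 0     # number of actual row computations

    def _row(self, i):
        if i not in self._cache:
            self._cache[i] = self._row_fn(i)
            self.evaluations += 1
        return self._cache[i]

    def __getitem__(self, index):
        if isinstance(index, slice):
            return [self._row(i) for i in range(*index.indices(self._n_rows))]
        if index < 0:
            index += self._n_rows
        if not 0 <= index < self._n_rows:
            raise IndexError("row index out of range")
        return self._row(index)


x = [1.0, 2.0, 3.0]


def hessian_row(i):
    # Row i of the Hessian of y = sum(x**3), which is diag(6 * x).
    return [6.0 * x[i] if j == i else 0.0 for j in range(len(x))]


H = LazyRowMatrix(hessian_row, len(x))
count_at_creation = H.evaluations    # nothing computed yet
row1 = H[1]                          # computes row 1 only
count_after_row = H.evaluations
full = H[:]                          # computes rows 0 and 2; row 1 comes from the cache
count_after_full = H.evaluations
full_again = H[:]                    # served entirely from the cache
count_after_repeat = H.evaluations

print(count_at_creation, count_after_row, count_after_full, count_after_repeat)  # 0 1 3 3
```

Indexing with `[:]`, as in `H[0][0][:]` in the Examples section, is the step that corresponds to requesting the whole matrix.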

Parameters

ys (paddle.Tensor) – Output derived from xs. Must contain a single element, or one element per sample when batch_axis=0.
xs (Union[paddle.Tensor, Tuple[paddle.Tensor, ...]]) – Input or tuple of inputs.
batch_axis (Optional[int], optional) – Index of batch axis. Defaults to None.

Returns

Hessian(s) of ys derived from xs.

Return type

Union[Tuple[Tuple[Hessian, …], …], Tuple[Hessian, …], Hessian]

Examples

>>> import paddle

>>> x1 = paddle.randn([3, ])
>>> x2 = paddle.randn([4, ])
>>> x1.stop_gradient = False
>>> x2.stop_gradient = False

>>> y = x1.sum() + x2.sum()

>>> H = paddle.autograd.hessian(y, (x1, x2))
>>> H_y_x1_x1 = H[0][0][:] # evaluate result of ddy/dx1x1
>>> H_y_x1_x2 = H[0][1][:] # evaluate result of ddy/dx1x2
>>> H_y_x2_x1 = H[1][0][:] # evaluate result of ddy/dx2x1
>>> H_y_x2_x2 = H[1][1][:] # evaluate result of ddy/dx2x2

>>> print(H_y_x1_x1.shape)
[3, 3]
>>> print(H_y_x1_x2.shape)
[3, 4]
>>> print(H_y_x2_x1.shape)
[4, 3]
>>> print(H_y_x2_x2.shape)
[4, 4]