hessian
- paddle.autograd.hessian(ys: paddle.Tensor, xs: Union[paddle.Tensor, Tuple[paddle.Tensor, ...]], batch_axis: Optional[int] = None) -> Union[Tuple[Tuple[paddle.autograd.autograd.Hessian, ...], ...], paddle.autograd.autograd.Hessian]
-
Computes the Hessian of the dependent variable ys with respect to the independent variable xs. Here, ys is the output of some operation applied to xs. ys must be a single Tensor, xs can be a Tensor or a tuple of Tensors, and batch_axis gives the position of the batch dimension in the input data.
When xs is a tuple of Tensors, the result is a nested tuple of Hessian objects. For example, if the Tensors in xs have shapes ([M1, ], [M2, ]), the returned Hessians have shapes (([M1, M1], [M1, M2]), ([M2, M1], [M2, M2])).
When batch_axis=None, only 0-dimensional or 1-dimensional Tensors are supported. If xs has shape [N, ] and ys has shape [ ] (a 0-dimensional Tensor), the output is a single Hessian matrix of shape [N, N].
When batch_axis=0, only 1-dimensional or 2-dimensional Tensors are supported. If xs has shape [B, N] and ys has shape [B, ], the output Hessian has shape [B, N, N].
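The shape rules above follow from the usual definition of the Hessian as the matrix of second partial derivatives. The block below is a notational sketch only: the symbols H, x^{(p)}, and the index names are introduced here for illustration, and the batched formula assumes that each y_b depends only on the matching row of x.

```latex
% Unbatched case: y is a scalar, x \in \mathbb{R}^{N}
H_{ij} = \frac{\partial^2 y}{\partial x_i \, \partial x_j},
\qquad H \in \mathbb{R}^{N \times N}

% Tuple input xs = (x^{(1)}, x^{(2)}), x^{(1)} \in \mathbb{R}^{M_1}, x^{(2)} \in \mathbb{R}^{M_2}
H = \begin{pmatrix} H^{(1,1)} & H^{(1,2)} \\ H^{(2,1)} & H^{(2,2)} \end{pmatrix},
\qquad
H^{(p,q)}_{ij} = \frac{\partial^2 y}{\partial x^{(p)}_i \, \partial x^{(q)}_j},
\qquad
H^{(p,q)} \in \mathbb{R}^{M_p \times M_q}

% Batched case (batch_axis = 0): x \in \mathbb{R}^{B \times N}, y \in \mathbb{R}^{B}
H_{bij} = \frac{\partial^2 y_b}{\partial x_{bi} \, \partial x_{bj}},
\qquad H \in \mathbb{R}^{B \times N \times N}
```

In this notation, block H^{(p,q)} corresponds to the entry at position [p-1][q-1] of the returned nested tuple.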
After the Hessian object is created, the full computation is not performed immediately; a partially lazy evaluation strategy is used instead. The object supports multi-dimensional indexing to obtain the entire Hessian matrix or a sub-matrix, and the actual computation happens at that point. Sub-matrices that have been computed are cached, so later indexing does not repeat the calculation.
- Parameters
-
ys (paddle.Tensor) – Output computed from xs; it must contain a single element (one per sample when batch_axis=0).
xs (Union[paddle.Tensor, Tuple[paddle.Tensor, ...]]) – Input or tuple of inputs.
batch_axis (Optional[int], optional) – Index of batch axis. Defaults to None.
- Returns
-
Hessian(s) of ys with respect to xs.
- Return type
-
Union[Tuple[Tuple[Hessian, …], …], Tuple[Hessian, …], Hessian]
Examples
>>> import paddle
>>> x1 = paddle.randn([3, ])
>>> x2 = paddle.randn([4, ])
>>> x1.stop_gradient = False
>>> x2.stop_gradient = False
>>> y = x1.sum() + x2.sum()
>>> H = paddle.autograd.hessian(y, (x1, x2))
>>> H_y_x1_x1 = H[0][0][:] # evaluate result of ddy/dx1x1
>>> H_y_x1_x2 = H[0][1][:] # evaluate result of ddy/dx1x2
>>> H_y_x2_x1 = H[1][0][:] # evaluate result of ddy/dx2x1
>>> H_y_x2_x2 = H[1][1][:] # evaluate result of ddy/dx2x2
>>> print(H_y_x1_x1.shape)
[3, 3]
>>> print(H_y_x1_x2.shape)
[3, 4]
>>> print(H_y_x2_x1.shape)
[4, 3]
>>> print(H_y_x2_x2.shape)
[4, 4]
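The lazy evaluation and caching described earlier can be pictured with a small pure-Python sketch. It is not Paddle's implementation: the class LazyRows, its evaluations counter, and the row-level cache granularity are all hypothetical choices made for illustration, and the Hessian values are written out analytically rather than obtained by automatic differentiation.

```python
from typing import Callable, Dict, List


class LazyRows:
    """Hypothetical stand-in for a lazily evaluated matrix (not Paddle's Hessian)."""

    def __init__(self, n_rows: int, compute_row: Callable[[int], List[float]]):
        self.n_rows = n_rows
        self._compute_row = compute_row
        self._cache: Dict[int, List[float]] = {}
        self.evaluations = 0  # number of rows that were actually computed

    def _row(self, i: int) -> List[float]:
        # Compute a row only on first access, then serve it from the cache.
        if i not in self._cache:
            self._cache[i] = self._compute_row(i)
            self.evaluations += 1
        return self._cache[i]

    def __getitem__(self, index):
        if isinstance(index, slice):
            return [self._row(i) for i in range(*index.indices(self.n_rows))]
        if index < 0:
            index += self.n_rows
        if not 0 <= index < self.n_rows:
            raise IndexError("row index out of range")
        return self._row(index)


# Hessian of y = x0**2 * x1 at (x0, x1) = (3, 2), written out by hand:
# d2y/dx0dx0 = 2*x1, d2y/dx0dx1 = d2y/dx1dx0 = 2*x0, d2y/dx1dx1 = 0.
x0, x1 = 3.0, 2.0
analytic_rows = [[2.0 * x1, 2.0 * x0], [2.0 * x0, 0.0]]

H = LazyRows(2, lambda i: analytic_rows[i])
assert H.evaluations == 0  # nothing is computed when the object is created
first = H[0]               # computes row 0 only
full = H[:]                # computes row 1 and reuses the cached row 0
again = H[:]               # served entirely from the cache
```

After these three accesses only two row computations have taken place, which mirrors the behaviour described for the Hessian object: indexing triggers evaluation, and repeated indexing reuses earlier results.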