pca_lowrank

paddle.sparse.pca_lowrank(x, q=None, center=True, niter=2, name=None)

Performs linear Principal Component Analysis (PCA) on a sparse matrix.

Let \(X\) be the input matrix or a batch of input matrices. The outputs approximately satisfy:

\[X \approx U \, \mathrm{diag}(S) \, V^{T}\]
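
As a rough illustration of this relation, the NumPy sketch below builds the three factors with an exact, dense SVD and checks their shapes and the reconstruction. It is not the routine documented here: the helper name `pca_exact` is hypothetical, sparsity is ignored, and column-wise centering is an assumption about what `center=True` does.

```python
import numpy as np

def pca_exact(x, q, center=True):
    """Reference PCA via a full SVD, truncated to q components (hypothetical helper)."""
    if center:
        # Assumption: centering subtracts the mean of each column (feature).
        x = x - x.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return u[:, :q], s[:q], vt[:q].T

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5))
q = min(6, *x.shape)  # documented default for q
u, s, v = pca_exact(x, q)

x_centered = x - x.mean(axis=0, keepdims=True)
reconstruction = u @ np.diag(s) @ v.T
print(u.shape, s.shape, v.shape)
print(np.allclose(reconstruction, x_centered))
```

Here q equals min(N, M), so the reconstruction is exact up to rounding; with a smaller q the product is only the best rank-q approximation of the centered matrix.
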
Parameters
  • x (Tensor) – The input tensor. Its shape should be [N, M], where N and M can be arbitrary positive integers. The data type of x should be float32 or float64.

  • q (int, optional) – A slightly overestimated rank of \(X\). Default value is \(q = \min(6, N, M)\).

  • center (bool, optional) – If True, center the input tensor. Default value is True.

  • niter (int, optional) – The number of subspace iterations to perform. Default value is 2.

  • name (str, optional) – Name for the operation (optional, default is None). For more information, please refer to Name.
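
To give a feel for how q and the niter argument from the signature typically interact, here is a minimal NumPy sketch of a randomized low-rank SVD by subspace iteration. This is a common textbook scheme; whether Paddle follows exactly these steps is an assumption, and `lowrank_svd_sketch` is a hypothetical name used only for illustration.

```python
import numpy as np

def lowrank_svd_sketch(a, q, niter=2, seed=0):
    """Randomized low-rank SVD by subspace iteration (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, m = a.shape
    # Start from a random q-dimensional sketch of the column space of `a`.
    basis, _ = np.linalg.qr(a @ rng.standard_normal((m, q)))
    for _ in range(niter):
        # Each pass multiplies by a.T and a again, sharpening the basis
        # toward the dominant singular directions.
        basis, _ = np.linalg.qr(a.T @ basis)
        basis, _ = np.linalg.qr(a @ basis)
    # Solve the small q x m problem exactly, then map back to n rows.
    small_u, s, vt = np.linalg.svd(basis.T @ a, full_matrices=False)
    return basis @ small_u, s, vt.T

rng = np.random.default_rng(1)
# A 50 x 20 matrix of exact rank 3; q=5 slightly overestimates that rank.
a = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 20))
u, s, v = lowrank_svd_sketch(a, q=5, niter=2)
error = np.linalg.norm(a - u @ np.diag(s) @ v.T) / np.linalg.norm(a)
print(u.shape, s.shape, v.shape, error < 1e-8)
```

Choosing q a little above the true rank gives the random sketch room to capture every dominant direction, which is why the parameter is described as a slight overestimate.
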

Returns

tuple (U, S, V), a nearly optimal approximation of the singular value decomposition of a centered matrix \(X\):

  • Tensor U, an N x q matrix.

  • Tensor S, a vector of length q.

  • Tensor V, an M x q matrix.
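
One common use of the returned tuple is projecting the data onto the first k principal directions. The sketch below uses NumPy with an exact SVD as a stand-in for (U, S, V); the variable names and the column-wise centering are assumptions made for illustration, not part of the API.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((10, 4))
x_centered = x - x.mean(axis=0, keepdims=True)

# Stand-in for the (U, S, V) tuple: an exact SVD of the centered data.
u, s, vt = np.linalg.svd(x_centered, full_matrices=False)
v = vt.T

k = 2
scores = x_centered @ v[:, :k]        # project N x M data onto k components
same_scores = u[:, :k] * s[:k]        # equivalent form using U and S
explained = s**2 / (x.shape[0] - 1)   # variance captured by each component
print(scores.shape, np.allclose(scores, same_scores))
```

The two expressions for the scores agree because the columns of V are orthonormal, so multiplying the centered data by V leaves U scaled by S.
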

Examples

>>> import paddle
>>> paddle.device.set_device('gpu')

>>> format = "coo"
>>> paddle.seed(2023)
>>> dense_x = paddle.randn((5, 5), dtype='float64')

>>> if format == "coo":
...     sparse_x = dense_x.to_sparse_coo(len(dense_x.shape))
... else:
...     sparse_x = dense_x.to_sparse_csr()

>>> print("sparse.pca_lowrank API only supports CUDA 11.x")
>>> # U, S, V = None, None, None
>>> # Use the code below when your device's CUDA version is >= 11.0.
>>> U, S, V = paddle.sparse.pca_lowrank(sparse_x)

>>> print(U)
Tensor(shape=[5, 5], dtype=float64, place=Place(gpu:0), stop_gradient=True,
       [[-0.31412600,  0.44814876,  0.18390454, -0.19967630, -0.79170452],
        [-0.31412600,  0.44814876,  0.18390454, -0.58579808,  0.56877700],
        [-0.31412600,  0.44814876,  0.18390454,  0.78547437,  0.22292751],
        [-0.38082462,  0.10982129, -0.91810233,  0.00000000,  0.00000000],
        [ 0.74762770,  0.62082796, -0.23585052,  0.00000000, -0.00000000]])

>>> print(S)
Tensor(shape=[5], dtype=float64, place=Place(gpu:0), stop_gradient=True,
       [1.56031096, 1.12956227, 0.27922715, 0.00000000, 0.00000000])

>>> print(V)
Tensor(shape=[5, 5], dtype=float64, place=Place(gpu:0), stop_gradient=True,
       [[ 0.88568469, -0.29081908,  0.06163676,  0.19597228, -0.29796422],
        [-0.26169364, -0.27616183,  0.43148760, -0.42522796, -0.69874939],
        [ 0.28587685,  0.30695344, -0.47790836, -0.76982533, -0.05501437],
        [-0.23958121, -0.62770647, -0.71141770,  0.11463224, -0.17125926],
        [ 0.08918713, -0.59238761,  0.27478686, -0.41833534,  0.62498824]])