top_p_sampling

paddle.Tensor.top_p_sampling(x: Tensor, ps: Tensor, threshold: Tensor | None = None, topp_seed: Tensor | None = None, seed: int = -1, k: int = 0, mode: Literal['truncated', 'non-truncated'] = 'truncated', return_top: bool = False, name: str | None = None) → tuple[Tensor, Tensor]

Perform top-p sampling on each row of x and return the sampled scores and their ids.

Parameters

x (Tensor) – The input 2-D Tensor, with data type float32, float16, or bfloat16.
ps (Tensor) – A 1-D Tensor with data type float32, float16, or bfloat16, specifying the top_p value for each query.
threshold (Tensor|None, optional) – A 1-D Tensor with data type float32, float16, or bfloat16, used to avoid sampling low-score tokens. Default is None.
topp_seed (Tensor|None, optional) – A 1-D Tensor with data type int64, specifying the random seed for each query. Default is None.
seed (int, optional) – The random seed. Default is -1.
k (int, optional) – The number of top_k scores/ids to be returned. Default is 0.
mode (str, optional) – The sampling strategy, either 'truncated' or 'non-truncated'. In 'truncated' mode, the probability distribution is truncated at the top_p value before sampling; in 'non-truncated' mode, it is not truncated. Default is 'truncated'.
return_top (bool, optional) – Whether to return the top_k scores and ids. Default is False.
name (str|None, optional) – For details, please refer to Name. Generally, no setting is required. Default: None.
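
To make the 'truncated' mode described above more concrete, here is a minimal pure-Python sketch of generic top-p (nucleus) sampling for a single query. It is an illustration only and does not reproduce Paddle's kernel: the helper top_p_sample is hypothetical, it assumes the row already holds probabilities that sum to 1, it assumes the cutoff is inclusive (cumulative mass >= top_p), and it ignores threshold, topp_seed, k, and the 'non-truncated' mode.

```python
import random


def top_p_sample(probs, top_p, rng):
    """Hypothetical helper: sample one index from `probs`, restricted to the
    smallest set of highest-probability entries whose cumulative mass
    reaches `top_p`. Returns (value, index), mirroring the return order of
    top_p_sampling."""
    # Rank indices from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Keep entries until their cumulative probability reaches top_p
    # (assumption: inclusive cutoff; the real kernel may differ).
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Draw from the kept entries; random.choices normalises the weights.
    choice = rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]
    return probs[choice], choice


rng = random.Random(2023)
probs = [0.1, 0.2, 0.7]

# With top_p = 0.5 the most probable entry alone covers the mass,
# so the draw is deterministic.
value, index = top_p_sample(probs, top_p=0.5, rng=rng)

# With top_p = 0.85 the two most probable entries are kept.
_, index_wide = top_p_sample(probs, top_p=0.85, rng=rng)
```

In this sketch a small top_p concentrates sampling on the highest-scoring entries, while a top_p close to 1 leaves almost the whole distribution available.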

Returns

tuple(Tensor, Tensor): the sampled values and their indices. The values have the same data type as the input x; the indices have data type int64.

Examples

>>> import paddle

>>> paddle.device.set_device('gpu')
>>> paddle.seed(2023)
>>> x = paddle.randn([2,3])
>>> print(x)
Tensor(shape=[2, 3], dtype=float32, place=Place(gpu:0), stop_gradient=True,
 [[-0.32012719, -0.07942779,  0.26011357],
  [ 0.79003978, -0.39958701,  1.42184138]])
>>> paddle.seed(2023)
>>> ps = paddle.randn([2])
>>> print(ps)
Tensor(shape=[2], dtype=float32, place=Place(gpu:0), stop_gradient=True,
 [-0.32012719, -0.07942779])
>>> value, index = paddle.tensor.top_p_sampling(x, ps)
>>> print(value)
Tensor(shape=[2, 1], dtype=float32, place=Place(gpu:0), stop_gradient=True,
 [[0.26011357],
  [1.42184138]])
>>> print(index)
Tensor(shape=[2, 1], dtype=int64, place=Place(gpu:0), stop_gradient=True,
 [[2],
  [2]])