RandomSampler

class paddle.io.RandomSampler(data_source, replacement=False, num_samples=None, generator=None)

Iterates over samples in random order by yielding shuffled indices. If replacement=False, it yields a shuffled permutation of the indices of the whole data source. If replacement=True, num_samples can be set to specify how many samples to draw.

Parameters
  • data_source (Dataset) – The dataset to sample from. This can be an instance of Dataset or IterableDataset, or any other Python object that implements __len__; indices are drawn from the range of the dataset length.

  • replacement (bool, optional) – If False, sample the whole dataset without replacement. If True, sample with replacement, and num_samples sets how many samples to draw. Default False.

  • num_samples (int, optional) – The number of samples to draw. Default None, in which case the length of data_source is used.

  • generator (Generator, optional) – A generator used to sample indices from data_source. Default None, meaning no generator is used.
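
The interaction between replacement and num_samples described above can be illustrated with a small standard-library sketch. It only mimics the documented semantics and is not Paddle's implementation; the helper name sketch_random_indices and its seed argument are hypothetical and exist only for this illustration.

```python
import random


def sketch_random_indices(n, replacement=False, num_samples=None, seed=None):
    """Hypothetical helper that mimics the documented sampling semantics."""
    rng = random.Random(seed)
    if not replacement:
        # Without replacement: a shuffled permutation of every index 0..n-1.
        indices = list(range(n))
        rng.shuffle(indices)
        return indices
    # With replacement: independent draws, so an index may appear more than once.
    # The number of draws falls back to n when num_samples is None.
    count = n if num_samples is None else num_samples
    return [rng.randrange(n) for _ in range(count)]


perm = sketch_random_indices(5, seed=0)
draws = sketch_random_indices(5, replacement=True, num_samples=8, seed=0)

print(sorted(perm))  # [0, 1, 2, 3, 4]: each index appears exactly once
print(len(draws))    # 8: more draws than indices, so at least one index repeats
```

Without replacement, the result is always a permutation of the full index range. With replacement, the number of draws is independent of the dataset length.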

Returns

A sampler that yields sample indices in random order.

Return type

RandomSampler

Examples

>>> import numpy as np
>>> from paddle.io import Dataset, RandomSampler

>>> np.random.seed(2023)
>>> class RandomDataset(Dataset):
...     def __init__(self, num_samples):
...         self.num_samples = num_samples
...
...     def __getitem__(self, idx):
...         image = np.random.random([784]).astype('float32')
...         label = np.random.randint(0, 9, (1, )).astype('int64')
...         return image, label
...
...     def __len__(self):
...         return self.num_samples
...
>>> sampler = RandomSampler(data_source=RandomDataset(100))

>>> for index in sampler:
...     print(index)
56
12
68
...
87