Dataset
class paddle.io.Dataset
An abstract class that encapsulates the methods and behaviors of datasets.
All map-style datasets (datasets whose samples can be retrieved by a given key) should subclass paddle.io.Dataset. Every subclass should implement the following methods:
__getitem__: returns the sample at a given index. paddle.io.DataLoader requires this method to read samples from the dataset.
__len__: returns the number of samples in the dataset. Some implementations of paddle.io.BatchSampler require this method.
See paddle.io.DataLoader.

Examples
import numpy as np
from paddle.io import Dataset


# define a random dataset
class RandomDataset(Dataset):
    def __init__(self, num_samples):
        self.num_samples = num_samples

    def __getitem__(self, idx):
        # each sample is a random 784-element image with a random integer label
        image = np.random.random([784]).astype('float32')
        label = np.random.randint(0, 9, (1, )).astype('int64')
        return image, label

    def __len__(self):
        return self.num_samples


dataset = RandomDataset(10)
for i in range(len(dataset)):
    print(dataset[i])
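To show why both methods are required, the following is a minimal sketch in plain Python of how a loader might consume a map-style dataset. It is not Paddle's implementation: SquaresDataset and iter_batches are hypothetical names introduced here for illustration, and the class does not inherit from paddle.io.Dataset so that the sketch runs without Paddle installed. In real code the dataset would subclass paddle.io.Dataset and be passed to paddle.io.DataLoader.

```python
import random


class SquaresDataset:
    """Hypothetical map-style dataset: sample i is the pair (i, i * i)."""

    def __init__(self, num_samples):
        self.num_samples = num_samples

    def __getitem__(self, idx):
        # Reject indices outside the range advertised by __len__.
        if not 0 <= idx < self.num_samples:
            raise IndexError(idx)
        return idx, idx * idx

    def __len__(self):
        return self.num_samples


def iter_batches(dataset, batch_size, shuffle=False, seed=None):
    """Yield lists of samples (a simplified stand-in for a loader and batch sampler)."""
    # __len__ tells the sampler which indices are valid.
    indices = list(range(len(dataset)))
    if shuffle:
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch_indices = indices[start:start + batch_size]
        # __getitem__ fetches each sample in the batch by its index.
        yield [dataset[i] for i in batch_indices]


batches = list(iter_batches(SquaresDataset(5), batch_size=2))
print(batches)
```

The sketch only needs len(dataset) to enumerate indices and dataset[i] to fetch samples, which is the contract a map-style dataset provides. The last batch is smaller than batch_size because five samples do not divide evenly into batches of two.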