chainer.datasets.LabeledImageDataset

class chainer.datasets.LabeledImageDataset(pairs, root='.', dtype=None, label_dtype=<class 'numpy.int32'>)
Dataset of image and label pairs built from a list of paths and labels.
This dataset reads external image files in the same way as ImageDataset. The difference from ImageDataset is that this dataset also returns an integer label. The paths and labels are given either as a list of pairs or as a text file that contains one path/label pair per line. In the latter case, each path and its corresponding label are separated by whitespace. This is the same format as the one used in Caffe.

Note
This dataset requires the Pillow package. To use this dataset, install Pillow (e.g. with the command pip install Pillow). Be sure to install the libraries for the image formats you want to use (e.g. libpng for PNG images and libjpeg for JPG images).

Warning
You are responsible for preprocessing the images before feeding them to a model. For example, if your dataset contains both RGB and grayscale images, make sure that you convert them to the same format. Otherwise you will get errors because the input dimensions are different for RGB and grayscale images.
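The conversion mentioned in the warning can be sketched with NumPy alone. This is only an illustration under an assumption that is not stated on this page: that each image is a channel-first array of shape (C, H, W), with C equal to 1 for grayscale and 3 for RGB. The helper name to_three_channels is hypothetical and is not part of Chainer.

```python
import numpy as np


def to_three_channels(image):
    """Return a (3, H, W) array from a (1, H, W) or (3, H, W) array.

    Hypothetical helper; assumes channel-first image arrays.
    """
    if image.ndim != 3:
        raise ValueError('expected a 3-dimensional (C, H, W) array')
    channels = image.shape[0]
    if channels == 3:
        # Already RGB: return the array unchanged.
        return image
    if channels == 1:
        # Grayscale: copy the single channel into three identical channels.
        return np.repeat(image, 3, axis=0)
    raise ValueError('unsupported number of channels: {}'.format(channels))


gray = np.zeros((1, 4, 5), dtype=np.float32)
rgb = np.ones((3, 4, 5), dtype=np.float32)
print(to_three_channels(gray).shape)  # (3, 4, 5)
print(to_three_channels(rgb).shape)   # (3, 4, 5)
```

Applying such a function to every image before batching gives all examples the same shape, which is what the warning asks for. Converting everything to grayscale instead would work equally well if the model expects a single channel.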
- Parameters
pairs (str or list of tuples) – If it is a string, it is a path to a text file that contains one path/label pair per line. If it is a list of pairs, the i-th element is a pair of the path to the i-th image and the corresponding label. In both cases, each path is relative to the root path given by the root argument.
root (str) – Root directory to retrieve images from.
dtype – Data type of the resulting image arrays. chainer.config.dtype is used by default (see Configuring Chainer).
label_dtype – Data type of the labels.
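To make the two accepted forms of pairs concrete, the following sketch reads a whitespace-separated path/label file into the equivalent list of tuples. It is not Chainer's own parsing code: the function read_pairs, the sample file name, and the sample paths are hypothetical, and skipping blank lines is a choice made for this sketch only.

```python
import os
import tempfile


def read_pairs(path):
    """Parse a text file with one 'image_path label' pair per line.

    Hypothetical helper that illustrates the file format only.
    """
    pairs = []
    with open(path) as pairs_file:
        for line_number, line in enumerate(pairs_file, start=1):
            fields = line.split()  # split on any run of whitespace
            if not fields:
                continue  # blank line: skipped in this sketch
            if len(fields) != 2:
                raise ValueError(
                    'invalid format at line {} in file {}'.format(
                        line_number, path))
            pairs.append((fields[0], int(fields[1])))
    return pairs


# Write a small sample file, then read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'train.txt')
    with open(sample_path, 'w') as sample_file:
        sample_file.write('cat/001.png 0\ndog/001.png 1\n')
    parsed = read_pairs(sample_path)

print(parsed)  # [('cat/001.png', 0), ('dog/001.png', 1)]
```

Either the file path or a list like parsed could then be given as the pairs argument, with root pointing at the directory that the relative image paths start from. Because fields are split on whitespace, this sketch does not handle paths that contain spaces.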
Methods
__getitem__(index)
Returns an example or a sequence of examples.
It implements the standard Python indexing and one-dimensional integer array indexing. It uses the get_example() method by default, but it may be overridden by the implementation to, for example, improve the slicing performance.
- Parameters
index (int, slice, list or numpy.ndarray) – An index of an example or indexes of examples.
- Returns
If index is int, returns an example created by get_example. If index is either slice or one-dimensional list or numpy.ndarray, returns a list of examples created by get_example.
Example
>>> import numpy
>>> from chainer import dataset
>>> class SimpleDataset(dataset.DatasetMixin):
...     def __init__(self, values):
...         self.values = values
...     def __len__(self):
...         return len(self.values)
...     def get_example(self, i):
...         return self.values[i]
...
>>> ds = SimpleDataset([0, 1, 2, 3, 4, 5])
>>> ds[1]  # Access by int
1
>>> ds[1:3]  # Access by slice
[1, 2]
>>> ds[[4, 0]]  # Access by one-dimensional integer list
[4, 0]
>>> index = numpy.arange(3)
>>> ds[index]  # Access by one-dimensional integer numpy.ndarray
[0, 1, 2]
get_example(i)
Returns the i-th example.
Implementations should override it. It should raise IndexError if the index is invalid.
- Parameters
i (int) – The index of the example.
- Returns
The i-th example.
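The IndexError requirement on get_example can be illustrated with a dependency-free sketch. The class BoundedDataset below is hypothetical and deliberately does not subclass anything from Chainer; in real code the same get_example body would sit in a dataset class such as the SimpleDataset shown earlier. Treating negative indices as invalid is an assumption of this sketch, not a rule stated on this page.

```python
class BoundedDataset:
    """Hypothetical dataset used only to illustrate index validation."""

    def __init__(self, values):
        self.values = list(values)

    def __len__(self):
        return len(self.values)

    def get_example(self, i):
        # Reject anything outside 0 <= i < len(self) explicitly, so that
        # negative indices do not silently wrap around as they would with
        # plain list indexing.
        if not 0 <= i < len(self):
            raise IndexError('index {} is out of range'.format(i))
        return self.values[i]


ds = BoundedDataset(['a', 'b', 'c'])
print(ds.get_example(2))  # c

try:
    ds.get_example(3)
except IndexError as error:
    print('IndexError:', error)  # IndexError: index 3 is out of range
```

An explicit range check like this gives a clear error message and keeps the behaviour independent of how the underlying storage handles out-of-range or negative indices.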