chainer.datasets.PickleDataset

class chainer.datasets.PickleDataset(reader)

Dataset stored in a storage using pickle.

pickle is the default serialization library of Python. This dataset stores arbitrary objects in a storage using pickle. Even for a large dataset, all data can be stored on large storage such as an HDD, and each example can still be accessed randomly.

>>> with chainer.datasets.open_pickle_dataset_writer(path_to_data) as w:
...     w.write((1, 2.0, 'hello'))
...     w.write((2, 3.0, 'good-bye'))
...
>>> with chainer.datasets.open_pickle_dataset(path_to_data) as dataset:
...     print(dataset[1])
...
(2, 3.0, 'good-bye')
Parameters

reader – File-like object. reader must support random access.
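
The random-access requirement can be illustrated with the standard library alone. The sketch below is not taken from Chainer; it assumes one plausible design in which the byte offset of every pickled object is recorded in a single pass, and seek() is then used to load one object on demand. The class name OffsetIndexedPickles is hypothetical.

```python
import io
import pickle


class OffsetIndexedPickles:
    """Hypothetical stand-in showing random access over pickled objects."""

    def __init__(self, reader):
        # seek() and tell() are required, hence the random-access constraint.
        if not reader.seekable():
            raise ValueError('reader must support random access')
        self._reader = reader
        self._positions = []
        reader.seek(0)
        while True:
            position = reader.tell()
            try:
                pickle.load(reader)  # advance past one pickled object
            except EOFError:
                break  # end of the stream reached
            self._positions.append(position)

    def __len__(self):
        return len(self._positions)

    def get_example(self, index):
        # The list lookup raises IndexError for an out-of-range index.
        self._reader.seek(self._positions[index])
        return pickle.load(self._reader)


# Write two objects back to back into an in-memory file.
buffer = io.BytesIO()
pickle.dump((1, 2.0, 'hello'), buffer)
pickle.dump((2, 3.0, 'good-bye'), buffer)

examples = OffsetIndexedPickles(buffer)
print(len(examples))            # 2
print(examples.get_example(1))  # (2, 3.0, 'good-bye')
```

In this sketch only the offsets are kept in memory, so the objects themselves stay on storage until they are requested.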

Methods

__enter__()
__exit__(exc_type, exc_value, traceback)
__getitem__(index)

Returns an example or a sequence of examples.

It implements the standard Python indexing and one-dimensional integer array indexing. It uses the get_example() method by default, but it may be overridden by the implementation to, for example, improve the slicing performance.

Parameters

index (int, slice, list or numpy.ndarray) – An index of an example or indexes of examples.

Returns

If index is an int, returns an example created by get_example. If index is a slice, a one-dimensional list, or a numpy.ndarray, returns a list of examples created by get_example.

Example

>>> import numpy
>>> from chainer import dataset
>>> class SimpleDataset(dataset.DatasetMixin):
...     def __init__(self, values):
...         self.values = values
...     def __len__(self):
...         return len(self.values)
...     def get_example(self, i):
...         return self.values[i]
...
>>> ds = SimpleDataset([0, 1, 2, 3, 4, 5])
>>> ds[1]   # Access by int
1
>>> ds[1:3]  # Access by slice
[1, 2]
>>> ds[[4, 0]]  # Access by one-dimensional integer list
[4, 0]
>>> index = numpy.arange(3)
>>> ds[index]  # Access by one-dimensional integer numpy.ndarray
[0, 1, 2]
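
The dispatch on the index type described above can be sketched as a plain function. This is an illustration under stated assumptions, not the library's code: get_items and Squares are hypothetical names, and only built-in int indexes are treated as scalars (NumPy integer scalars are not handled here).

```python
def get_items(dataset, index):
    """Hypothetical helper mirroring the indexing behavior described above."""
    if isinstance(index, slice):
        # Resolve the slice against the dataset length, then fetch one by one.
        start, stop, step = index.indices(len(dataset))
        return [dataset.get_example(i) for i in range(start, stop, step)]
    if isinstance(index, int):
        return dataset.get_example(index)
    # Otherwise treat index as a one-dimensional sequence of integers.
    return [dataset.get_example(int(i)) for i in index]


class Squares:
    """Toy dataset whose i-th example is i squared."""

    def __len__(self):
        return 5

    def get_example(self, i):
        if not -len(self) <= i < len(self):
            raise IndexError('index out of range')
        return (i % len(self)) ** 2


squares = Squares()
print(get_items(squares, 2))            # 4
print(get_items(squares, slice(1, 4)))  # [1, 4, 9]
print(get_items(squares, [4, 0]))       # [16, 0]
```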
__len__()

Returns the number of data points.

close()

Closes a file reader.

After a user calls this method, the dataset will no longer be accessible.
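
The relationship between close() and the with statement can be sketched without Chainer. The example assumes, as is conventional for Python context managers, that __exit__ simply calls close(). The class ClosingDataset and its first() method are hypothetical, and the ValueError caught here is raised by io.BytesIO, so the exception type seen with another reader may differ.

```python
import io
import pickle


class ClosingDataset:
    """Hypothetical dataset that owns a reader and closes it on exit."""

    def __init__(self, reader):
        self._reader = reader

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Leaving the with block closes the underlying reader.
        self.close()

    def close(self):
        self._reader.close()

    def first(self):
        self._reader.seek(0)
        return pickle.load(self._reader)


with ClosingDataset(io.BytesIO(pickle.dumps('hello'))) as ds:
    print(ds.first())  # hello

# The reader is closed now, so further access fails.
try:
    ds.first()
except ValueError as error:
    print('access after close failed:', error)
```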

get_example(index)

Returns the i-th example.

Implementations should override it. It should raise IndexError if the index is invalid.

Parameters

index (int) – The index of the example.

Returns

The i-th example.