chainer.datasets.ZippedImageDataset

class chainer.datasets.ZippedImageDataset(zipfilename, dtype=None)[source]

Dataset of images built from a zip file.

This dataset reads an external image file in the given zipfile. The zipfile shall contain only image files. This shall be able to replace ImageDataset and works better on NFS and other networked file systems. If zipfile becomes too large you may consider MultiZippedImageDataset as a handy alternative.

Known issue: pickle and unpickle on same process may cause race condition on ZipFile. Pickle of this class is expected to be sent to different processess via ChainerMN.

Parameters
  • zipfilename (str) – a string to point zipfile path

  • dtype – Data type of resulting image arrays. chainer.config.dtype is used by default (see Configuring Chainer).

Methods

__getitem__(index)[source]

Returns an example or a sequence of examples.

It implements the standard Python indexing and one-dimensional integer array indexing. It uses the get_example() method by default, but it may be overridden by the implementation to, for example, improve the slicing performance.

Parameters

index (int, slice, list or numpy.ndarray) – An index of an example or indexes of examples.

Returns

If index is int, returns an example created by get_example. If index is either slice or one-dimensional list or numpy.ndarray, returns a list of examples created by get_example.

Example

>>> import numpy
>>> from chainer import dataset
>>> class SimpleDataset(dataset.DatasetMixin):
...     def __init__(self, values):
...         self.values = values
...     def __len__(self):
...         return len(self.values)
...     def get_example(self, i):
...         return self.values[i]
...
>>> ds = SimpleDataset([0, 1, 2, 3, 4, 5])
>>> ds[1]   # Access by int
1
>>> ds[1:3]  # Access by slice
[1, 2]
>>> ds[[4, 0]]  # Access by one-dimensional integer list
[4, 0]
>>> index = numpy.arange(3)
>>> ds[index]  # Access by one-dimensional integer numpy.ndarray
[0, 1, 2]
__len__()[source]

Returns the number of data points.

get_example(i_or_filename)[source]

Returns the i-th example.

Implementations should override it. It should raise IndexError if the index is invalid.

Parameters

i (int) – The index of the example.

Returns

The i-th example.