tf.data.experimental.parallel_interleave


A parallel version of the Dataset.interleave() transformation. (deprecated)

tf.data.experimental.parallel_interleave(
    map_func, cycle_length, block_length=1, sloppy=False,
    buffer_output_elements=None, prefetch_input_elements=None
)

Warning: THIS FUNCTION IS DEPRECATED. It will be removed in a future version. Instructions for updating: Use tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE) instead. If sloppy execution is desired, use tf.data.Options.experimental_deterministic.

parallel_interleave() maps map_func across its input to produce nested datasets, and outputs their elements interleaved. Unlike tf.data.Dataset.interleave called without num_parallel_calls, it fetches elements from cycle_length nested datasets in parallel, which increases throughput, especially in the presence of stragglers. The sloppy argument can improve performance further by relaxing the requirement that outputs are produced in a deterministic order: the implementation may then skip over nested datasets whose elements are not readily available when requested.
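The deterministic ordering described above can be sketched in plain Python. This is a simplified model of the output order when sloppy is False, not TensorFlow's implementation; interleave_order is a hypothetical helper name, and the model ignores parallelism, buffering, and prefetching entirely.

```python
from itertools import islice


def interleave_order(inputs, map_func, cycle_length, block_length=1):
    """Yield elements in round-robin order over `cycle_length` open slots.

    Hypothetical model of the deterministic ordering only; it runs
    sequentially and does not reflect how TensorFlow schedules work.
    """
    source = iter(inputs)
    slots = [None] * cycle_length  # currently open nested iterators
    source_done = False
    open_count = 0
    index = 0
    while not source_done or open_count > 0:
        # Refill an empty slot from the input, if any input remains.
        if slots[index] is None and not source_done:
            try:
                slots[index] = iter(map_func(next(source)))
                open_count += 1
            except StopIteration:
                source_done = True
        if slots[index] is not None:
            # Take up to `block_length` consecutive elements from this slot.
            block = list(islice(slots[index], block_length))
            yield from block
            if len(block) < block_length:
                # The nested iterator is exhausted; free the slot.
                slots[index] = None
                open_count -= 1
        index = (index + 1) % cycle_length


order = list(interleave_order(range(4), lambda x: [x] * 3, cycle_length=2))
print(order)
```

With cycle_length=2, the first two inputs are drained in alternation before the next two are opened. A sloppy run may produce the same elements in a different order.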

Example usage:

import tensorflow as tf

# Read from 4 files concurrently, interleaving their records.
filenames = tf.data.Dataset.list_files("/path/to/data/train*.tfrecords")
dataset = filenames.apply(
    tf.data.experimental.parallel_interleave(
        lambda filename: tf.data.TFRecordDataset(filename),
        cycle_length=4))

WARNING: If sloppy is True, the order of produced elements is not deterministic.
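The migration suggested in the deprecation notice might look like the sketch below. To stay self-contained it uses small in-memory datasets instead of TFRecord files; make_nested is a hypothetical stand-in for a function such as lambda filename: tf.data.TFRecordDataset(filename), and the cycle_length of 2 is chosen only for illustration.

```python
import tensorflow as tf


def make_nested(x):
    # Hypothetical stand-in for opening a file: each input element
    # produces a nested dataset holding three copies of itself.
    return tf.data.Dataset.from_tensors(x).repeat(3)


source = tf.data.Dataset.range(4)

# Replacement for source.apply(parallel_interleave(make_nested, cycle_length=2)).
dataset = source.interleave(
    make_nested,
    cycle_length=2,
    block_length=1,
    num_parallel_calls=tf.data.experimental.AUTOTUNE)

# Rough counterpart of sloppy=True: allow elements to arrive out of order.
options = tf.data.Options()
options.experimental_deterministic = False
sloppy_dataset = dataset.with_options(options)

elements = [int(v) for v in dataset.as_numpy_iterator()]
sloppy_elements = [int(v) for v in sloppy_dataset.as_numpy_iterator()]
print(elements)
print(sorted(sloppy_elements))
```

Without the options override, the element order is deterministic. With experimental_deterministic set to False, the same elements are produced but their order may vary between runs.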

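Args:

map_func: A function mapping a nested structure of tensors to a Dataset.
cycle_length: The number of input Datasets to interleave from in parallel.
block_length: The number of consecutive elements to pull from an input Dataset before advancing to the next input Dataset. Defaults to 1.
sloppy: If False (the default), elements are produced in deterministic order. If True, determinism is traded for performance and elements may be produced out of order.
buffer_output_elements: The number of elements each iterator being interleaved should buffer, similar to applying prefetch() to each interleaved iterator.
prefetch_input_elements: The number of input elements to transform to iterators before they are needed for interleaving.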

Returns:

A Dataset transformation function, which can be passed to tf.data.Dataset.apply.