Class RecordInput
Defined in tensorflow/python/ops/data_flow_ops.py.
RecordInput asynchronously reads and randomly yields TFRecords.
A RecordInput Op will continuously read a batch of records asynchronously into a buffer of some fixed capacity. It can also asynchronously yield random records from this buffer.
It will not start yielding until at least buffer_size / 2 elements have been placed into the buffer so that sufficient randomization can take place.
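The half-full rule above can be illustrated with a small pure-Python model. This is only a sketch of the idea, not the TensorFlow implementation: the class ShuffleBuffer and its methods are hypothetical names, and it ignores reader threads and end-of-input draining.

```python
import random


class ShuffleBuffer:
    """Toy model (hypothetical): a bounded buffer that yields random
    records only once it is at least half full."""

    def __init__(self, buffer_size, seed=0):
        self.buffer_size = buffer_size
        self._items = []
        self._rng = random.Random(seed)

    def put(self, record):
        # Refuse new records once the fixed capacity is reached.
        if len(self._items) >= self.buffer_size:
            raise OverflowError("buffer is full")
        self._items.append(record)

    def ready(self):
        # Yielding is allowed only with at least buffer_size / 2 records.
        return len(self._items) >= self.buffer_size / 2

    def take(self):
        if not self.ready():
            raise RuntimeError("not enough records buffered yet")
        index = self._rng.randrange(len(self._items))
        # Swap the chosen record to the end so that removal is O(1).
        self._items[index], self._items[-1] = self._items[-1], self._items[index]
        return self._items.pop()
```

With buffer_size=4, the model refuses to yield after one put() and becomes ready after the second, which mirrors the threshold described above.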
The order in which the files are read will be shifted each epoch by shift_amount so that the data is presented in a different order every epoch.
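The per-epoch shift can be pictured as a rotation of the file list. The sketch below makes several assumptions that the text does not confirm: shift_ratio is treated as a fraction between 0 and 1, the shift is truncated to a whole number of files, and it accumulates across epochs. The function epoch_file_order is a hypothetical helper, not part of the API.

```python
def epoch_file_order(filenames, shift_ratio, epoch):
    """Return the file list rotated forward for the given epoch.

    Assumption: each epoch moves the start file forward by
    int(shift_ratio * number_of_files) positions.
    """
    if not filenames:
        return []
    shift_per_epoch = int(shift_ratio * len(filenames))
    shift = (shift_per_epoch * epoch) % len(filenames)
    # Rotate: files before the new start position move to the end.
    return filenames[shift:] + filenames[:shift]


files = ["f0", "f1", "f2", "f3"]
first_epoch = epoch_file_order(files, shift_ratio=0.25, epoch=0)
second_epoch = epoch_file_order(files, shift_ratio=0.25, epoch=1)
```

Under these assumptions, a shift_ratio of 0.25 over four files starts each epoch one file later than the previous one.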
__init__
__init__(
file_pattern,
batch_size=1,
buffer_size=1,
parallelism=1,
shift_ratio=0,
seed=0,
name=None,
batches=None,
compression_type=None
)
Constructs a RecordInput Op.
Args:
file_pattern: File path to the dataset, possibly containing wildcards. All matching files will be iterated over each epoch.
batch_size: How many records to return at a time.
buffer_size: The maximum number of records the buffer will contain.
parallelism: How many reader threads to use for reading from files.
shift_ratio: What percentage of the total number of files to move the start file forward by each epoch.
seed: The random number seed used by the generator that randomizes records.
name: Optional name for the operation.
batches: None by default, creating a single batch op. Otherwise specifies how many batches to create, which are returned as a list when get_yield_op() is called. An example use case is to split processing between devices on one computer.
compression_type: The type of compression for the file. Currently ZLIB and GZIP are supported. Defaults to none.
Raises:
ValueError: If one of the arguments is invalid.
Methods
tf.contrib.framework.RecordInput.get_yield_op
get_yield_op()
Adds a node that yields a group of records every time it is executed.
If the RecordInput batches parameter is not None, it yields a list of record batches with the specified batch_size.
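To make the two return shapes concrete, here is a pure-Python sketch of how one group of records might be divided when batches is set. The round-robin assignment and the helper name split_into_batches are assumptions made for illustration; the description above only states that a list of batches is returned.

```python
def split_into_batches(records, batches=None):
    """Hypothetical helper: return the records as one group, or spread
    them over `batches` lists in round-robin order."""
    if batches is None:
        # Single batch op: the whole group is returned as one batch.
        return list(records)
    batch_list = [[] for _ in range(batches)]
    for index, record in enumerate(records):
        # Record i goes to list i mod batches (assumed policy).
        batch_list[index % batches].append(record)
    return batch_list


group = ["r0", "r1", "r2", "r3", "r4"]
single = split_into_batches(group)
per_device = split_into_batches(group, batches=2)
```

Each inner list could then be handed to a different device on the same machine, which is the use case mentioned for the batches argument.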