Class CSVReader
Defined in tensorflow/contrib/timeseries/python/timeseries/input_pipeline.py.
Reads from a collection of CSV-formatted files.
__init__
__init__(
filenames,
column_names=(feature_keys.TrainEvalFeatures.TIMES, feature_keys.TrainEvalFeatures.VALUES),
column_dtypes=None,
skip_header_lines=None,
read_num_records_hint=4096
)
CSV-parsing reader for a TimeSeriesInputFn.
Args:
filenames: A filename or list of filenames to read the time series from. Each line must have columns corresponding to column_names.
column_names: A list indicating names for each feature. TrainEvalFeatures.TIMES and TrainEvalFeatures.VALUES are required; VALUES may be repeated to indicate a multivariate series.
column_dtypes: If provided, must be a list with the same length as column_names, indicating dtypes for each column. Defaults to tf.int64 for TrainEvalFeatures.TIMES and tf.float32 for everything else.
skip_header_lines: Passed on to tf.TextLineReader; skips this number of lines at the beginning of each file.
read_num_records_hint: When not reading a full dataset, indicates the number of records to parse/transfer in a single chunk (for efficiency). The actual number transferred at one time may be more or less.
Raises:
ValueError: If required column names are not specified, or if the lengths of column_names and column_dtypes do not match.
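The argument rules above (required column names, matching lengths, default dtypes) can be illustrated with a small pure-Python sketch. This is not the library's implementation: the helper name `resolve_column_dtypes` is hypothetical, the strings "times" and "values" stand in for TrainEvalFeatures.TIMES and TrainEvalFeatures.VALUES, and "int64"/"float32" stand in for tf.int64/tf.float32 so the example runs without TensorFlow installed.

```python
# Illustrative stand-ins (assumptions, not TensorFlow objects).
TIMES = "times"    # stands in for TrainEvalFeatures.TIMES
VALUES = "values"  # stands in for TrainEvalFeatures.VALUES


def resolve_column_dtypes(column_names=(TIMES, VALUES), column_dtypes=None):
    """Hypothetical helper mirroring the documented constructor checks."""
    names = list(column_names)
    # Both the time column and at least one value column are required.
    if TIMES not in names or VALUES not in names:
        raise ValueError(
            "column_names must include both %r and %r" % (TIMES, VALUES))
    # Documented defaults: int64 for times, float32 for everything else.
    if column_dtypes is None:
        return ["int64" if name == TIMES else "float32" for name in names]
    dtypes = list(column_dtypes)
    # An explicit dtype list must line up one-to-one with the column names.
    if len(dtypes) != len(names):
        raise ValueError(
            "column_dtypes has %d entries but column_names has %d"
            % (len(dtypes), len(names)))
    return dtypes


# A bivariate series repeats the values name once per value column.
print(resolve_column_dtypes((TIMES, VALUES, VALUES)))
```

Repeating the values name is how the documentation describes a multivariate series, so the sketch treats every non-time column as float32 by default.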
Methods
tf.contrib.timeseries.CSVReader.check_dataset_size
check_dataset_size(minimum_dataset_size)
When possible, raises an error if the dataset is too small.
This method allows TimeSeriesReaders to raise informative error messages if the user has selected a window size in their TimeSeriesInputFn which is larger than the dataset size. However, many TimeSeriesReaders will not have access to a dataset size, in which case they do not need to override this method.
Args:
minimum_dataset_size: The minimum number of records which should be contained in the dataset. Readers should attempt to raise an error when possible if an epoch of data contains fewer records.
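As a rough illustration of the contract described for check_dataset_size, the sketch below shows a reader that knows its dataset size and raises an informative error when the requested minimum exceeds it. The class `InMemoryReaderSketch` is hypothetical and is not part of tf.contrib.timeseries; a reader without access to its dataset size would simply not perform this check.

```python
class InMemoryReaderSketch(object):
    """Hypothetical reader that holds all of its records in memory."""

    def __init__(self, records):
        self._records = list(records)

    def check_dataset_size(self, minimum_dataset_size):
        # The size is known here, so a too-large window can be reported
        # up front instead of failing later during training.
        num_records = len(self._records)
        if num_records < minimum_dataset_size:
            raise ValueError(
                "Dataset has %d records, but at least %d are required "
                "(is the window size larger than the dataset?)"
                % (num_records, minimum_dataset_size))


reader = InMemoryReaderSketch(range(10))
reader.check_dataset_size(8)   # passes silently: 10 >= 8
try:
    reader.check_dataset_size(32)
except ValueError as error:
    print(error)
```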
tf.contrib.timeseries.CSVReader.read
read()
Reads a chunk of data from the tf.ReaderBase for later re-chunking.
tf.contrib.timeseries.CSVReader.read_full
read_full()
Reads a full epoch of data into memory.
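To make "a full epoch of data in memory" concrete, here is a minimal pure-Python sketch of parsing CSV text into a times list and a per-row list of values. It is an assumption-laden illustration, not the TensorFlow implementation: the function `read_full_sketch` is hypothetical, it uses the standard csv module rather than tf.TextLineReader, and the keys "times" and "values" stand in for TrainEvalFeatures.TIMES and TrainEvalFeatures.VALUES.

```python
import csv
import io


def read_full_sketch(csv_text, column_names=("times", "values"),
                     skip_header_lines=0):
    """Hypothetical stand-in: parse every row of csv_text into memory."""
    rows = list(csv.reader(io.StringIO(csv_text)))[skip_header_lines:]
    features = {"times": [], "values": []}
    for row in rows:
        if not row:
            continue  # ignore blank lines
        if len(row) != len(column_names):
            raise ValueError("Row %r does not match column_names" % (row,))
        row_values = []
        for name, cell in zip(column_names, row):
            if name == "times":
                features["times"].append(int(cell))
            elif name == "values":
                # Repeated value columns form one multivariate observation.
                row_values.append(float(cell))
            else:
                # Any other named column is kept as a float feature.
                features.setdefault(name, []).append(float(cell))
        features["values"].append(row_values)
    return features


data = "time,a,b\n0,1.0,2.0\n1,1.5,2.5\n"
epoch = read_full_sketch(
    data, column_names=("times", "values", "values"), skip_header_lines=1)
print(epoch["times"])
print(epoch["values"])
```

Holding the whole epoch in memory is only sensible for datasets that fit there; for larger inputs, chunked reading as described for read() is the relevant path.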