tf.contrib.timeseries.CSVReader

Class CSVReader

Defined in tensorflow/contrib/timeseries/python/timeseries/input_pipeline.py.

Reads from a collection of CSV-formatted files.

__init__

__init__(
    filenames,
    column_names=(feature_keys.TrainEvalFeatures.TIMES, feature_keys.TrainEvalFeatures.VALUES),
    column_dtypes=None,
    skip_header_lines=None,
    read_num_records_hint=4096
)

CSV-parsing reader for a TimeSeriesInputFn.

Args:

  • filenames: A filename or list of filenames to read the time series from. Each line must contain one field per entry in column_names, in the same order.
  • column_names: A list indicating names for each feature. TrainEvalFeatures.TIMES and TrainEvalFeatures.VALUES are required; VALUES may be repeated to indicate a multivariate series.
  • column_dtypes: If provided, must be a list with the same length as column_names, indicating dtypes for each column. Defaults to tf.int64 for TrainEvalFeatures.TIMES and tf.float32 for everything else.
  • skip_header_lines: Passed on to tf.TextLineReader; skips this number of lines at the beginning of each file.
  • read_num_records_hint: When not reading a full dataset, indicates the number of records to parse/transfer in a single chunk (for efficiency). The actual number transferred at one time may be more or less.
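
To make the column conventions above concrete, the following sketch parses a small CSV using only the Python standard library. It illustrates the documented rules and is not the CSVReader implementation: the helper parse_series is hypothetical, and the string keys "times" and "values" are assumed stand-ins for TrainEvalFeatures.TIMES and TrainEvalFeatures.VALUES.

```python
import csv
import io

TIMES = "times"    # assumed stand-in for TrainEvalFeatures.TIMES
VALUES = "values"  # assumed stand-in for TrainEvalFeatures.VALUES


def parse_series(text, column_names=(TIMES, VALUES), skip_header_lines=0):
    """Hypothetical helper that groups CSV fields by column name.

    TIMES fields are parsed as int (mirroring the tf.int64 default) and all
    other fields as float (mirroring tf.float32). A repeated name collects
    its fields into one vector per row, giving a multivariate series.
    """
    rows = list(csv.reader(io.StringIO(text)))[skip_header_lines:]
    features = {name: [] for name in column_names}
    for row in rows:
        if len(row) != len(column_names):
            raise ValueError("Each line must have one field per column name.")
        per_row = {name: [] for name in features}
        for name, field in zip(column_names, row):
            convert = int if name == TIMES else float
            per_row[name].append(convert(field))
        for name, parsed in per_row.items():
            # Times are scalar per row; values stay as a vector per row.
            features[name].append(parsed[0] if name == TIMES else parsed)
    return features


data = "time,x,y\n0,1.5,2.5\n1,1.7,2.1\n"
series = parse_series(
    data, column_names=(TIMES, VALUES, VALUES), skip_header_lines=1)
print(series[TIMES])
print(series[VALUES])
```

Here VALUES appears twice in column_names, so each row contributes a two-element value vector, and skip_header_lines=1 drops the header row before parsing.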

Raises:

  • ValueError: If required column names are not specified, or if column_dtypes is provided and its length does not match that of column_names.
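
The error conditions can be sketched as a standalone check. This is a minimal illustration under stated assumptions, not the library's code: validate_columns is a hypothetical function, and "times" and "values" are assumed stand-ins for the TrainEvalFeatures keys.

```python
TIMES = "times"    # assumed stand-in for TrainEvalFeatures.TIMES
VALUES = "values"  # assumed stand-in for TrainEvalFeatures.VALUES


def validate_columns(column_names, column_dtypes=None):
    """Hypothetical check mirroring the documented ValueError conditions."""
    for required in (TIMES, VALUES):
        if required not in column_names:
            raise ValueError("Missing required column name: %r" % required)
    if column_dtypes is not None and len(column_dtypes) != len(column_names):
        raise ValueError(
            "Got %d dtypes for %d column names."
            % (len(column_dtypes), len(column_names)))


# Valid: both required names are present and the lengths agree.
validate_columns((TIMES, VALUES, VALUES), ("int64", "float32", "float32"))

errors = []
for names, dtypes in [((TIMES,), None),
                      ((TIMES, VALUES), ("int64",))]:
    try:
        validate_columns(names, dtypes)
    except ValueError as e:
        errors.append(str(e))
print(errors)
```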

Methods

tf.contrib.timeseries.CSVReader.check_dataset_size

check_dataset_size(minimum_dataset_size)

When possible, raises an error if the dataset is too small.

This method allows TimeSeriesReaders to raise informative error messages if the user has selected a window size in their TimeSeriesInputFn which is larger than the dataset size. However, many TimeSeriesReaders will not have access to a dataset size, in which case they do not need to override this method.

Args:

  • minimum_dataset_size: The minimum number of records which should be contained in the dataset. Readers should attempt to raise an error when possible if an epoch of data contains fewer records.
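
The override pattern described above might look like the following sketch. Both classes are hypothetical and written in plain Python: the base check does nothing because the dataset size is unknown, while a reader that holds its records in memory can compare sizes and fail early.

```python
class SketchReader:
    """Hypothetical base reader: size unknown, so the check is a no-op."""

    def check_dataset_size(self, minimum_dataset_size):
        pass


class InMemorySketchReader(SketchReader):
    """Hypothetical reader that knows how many records it holds."""

    def __init__(self, records):
        self._records = list(records)

    def check_dataset_size(self, minimum_dataset_size):
        size = len(self._records)
        if size < minimum_dataset_size:
            raise ValueError(
                "Dataset has %d records, but at least %d are required "
                "(is the window size too large?)."
                % (size, minimum_dataset_size))


reader = InMemorySketchReader(range(10))
reader.check_dataset_size(8)       # passes silently
try:
    reader.check_dataset_size(16)  # window larger than the dataset
    message = None
except ValueError as e:
    message = str(e)
print(message)
```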

tf.contrib.timeseries.CSVReader.read

read()

Reads a chunk of data from the tf.ReaderBase for later re-chunking.

tf.contrib.timeseries.CSVReader.read_full

read_full()

Reads a full epoch of data into memory.
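
The difference between chunked and full reads can be illustrated in plain Python. This is only a sketch: read_chunks, read_full, and rechunk are hypothetical functions, and the sequential windowing in rechunk is an assumption made for illustration, since how a TimeSeriesInputFn re-chunks data is not specified here.

```python
def read_chunks(records, read_num_records_hint=4096):
    """Hypothetical chunked read: yields lists of at most `hint` records."""
    for start in range(0, len(records), read_num_records_hint):
        yield records[start:start + read_num_records_hint]


def read_full(records):
    """Hypothetical full read: returns one epoch of records at once."""
    return list(records)


def rechunk(chunks, window_size):
    """Regroups arbitrary-sized chunks into fixed-size sequential windows.

    A trailing partial window is dropped (an assumption of this sketch).
    """
    buffer = []
    for chunk in chunks:
        buffer.extend(chunk)
        while len(buffer) >= window_size:
            yield buffer[:window_size]
            buffer = buffer[window_size:]


records = list(range(10))
chunks = list(read_chunks(records, read_num_records_hint=4))
windows = list(rechunk(chunks, window_size=3))
print(chunks)
print(windows)
print(read_full(records) == records)
```

The chunk size only affects how many records move at once; the windows produced after re-chunking do not depend on it.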