tf.keras.layers.TimeDistributed

This wrapper allows to apply a layer to every temporal slice of an input.

Inherits From: Wrapper

View aliases

Compat aliases for migration

tf.compat.v1.keras.layers.TimeDistributed, tf.compat.v2.keras.layers.TimeDistributed

tf.keras.layers.TimeDistributed(
    layer, **kwargs
)

The input should be at least 3D, and the dimension of index one will be considered to be the temporal dimension.

Consider a batch of 32 samples, where each sample is a sequence of 10 vectors of 16 dimensions. The batch input shape of the layer is then (32, 10, 16), and the input_shape, not including the samples dimension, is (10, 16).

You can then use TimeDistributed to apply a Dense layer to each of the 10 timesteps, independently:

# as the first layer in a model
model = Sequential()
model.add(TimeDistributed(Dense(8), input_shape=(10, 16)))
# now model.output_shape == (None, 10, 8)

The output will then have shape (32, 10, 8).

In subsequent layers, there is no need for the input_shape:

model.add(TimeDistributed(Dense(32)))
# now model.output_shape == (None, 10, 32)

The output will then have shape (32, 10, 32).

TimeDistributed can be used with arbitrary layers, not just Dense, for instance with a Conv2D layer:

model = Sequential()
model.add(TimeDistributed(Conv2D(64, (3, 3)),
                          input_shape=(10, 299, 299, 3)))

Arguments:

layer: a layer instance.

Call arguments:

inputs: Input tensor.
training: Python boolean indicating whether the layer should behave in training mode or in inference mode. This argument is passed to the wrapped layer (only if the layer supports this argument).
mask: Binary tensor of shape (samples, timesteps) indicating whether a given timestep should be masked. This argument is passed to the wrapped layer (only if the layer supports this argument).

Raises:

ValueError: If not initialized with a Layer instance.