Class AttentionCellWrapper
Inherits From: RNNCell
Defined in tensorflow/contrib/rnn/python/ops/rnn_cell.py
.
Basic attention cell wrapper.
Implementation based on https://arxiv.org/abs/1601.06733.
__init__
__init__(
cell,
attn_length,
attn_size=None,
attn_vec_size=None,
input_size=None,
state_is_tuple=True,
reuse=None
)
Create a cell with attention.
Args:
cell
: an RNNCell, an attention is added to it.attn_length
: integer, the size of an attention window.attn_size
: integer, the size of an attention vector. Equal to cell.output_size by default.attn_vec_size
: integer, the number of convolutional features calculated on attention state and a size of the hidden layer built from base cell state. Equal attn_size to by default.input_size
: integer, the size of a hidden linear layer, built from inputs and attention. Derived from the input tensor by default.state_is_tuple
: If True, accepted and returned states are n-tuples, wheren = len(cells)
. By default (False), the states are all concatenated along the column axis.reuse
: (optional) Python boolean describing whether to reuse variables in an existing scope. If notTrue
, and the existing scope already has the given variables, an error is raised.
Raises:
TypeError
: if cell is not an RNNCell.ValueError
: if cell returns a state tuple but the flagstate_is_tuple
isFalse
or if attn_length is zero or less.
Properties
activity_regularizer
Optional regularizer function for the output of this layer.
dtype
graph
input
Retrieves the input tensor(s) of a layer.
Only applicable if the layer has exactly one input, i.e. if it is connected to one incoming layer.
Returns:
Input tensor or list of input tensors.
Raises:
AttributeError
: if the layer is connected to more than one incoming layers.
Raises:
RuntimeError
: If called in Eager mode.AttributeError
: If no inbound nodes are found.
input_mask
Retrieves the input mask tensor(s) of a layer.
Only applicable if the layer has exactly one inbound node, i.e. if it is connected to one incoming layer.
Returns:
Input mask tensor (potentially None) or list of input mask tensors.
Raises:
AttributeError
: if the layer is connected to more than one incoming layers.
input_shape
Retrieves the input shape(s) of a layer.
Only applicable if the layer has exactly one input, i.e. if it is connected to one incoming layer, or if all inputs have the same shape.
Returns:
Input shape, as an integer shape tuple (or list of shape tuples, one tuple per input tensor).
Raises:
AttributeError
: if the layer has no defined input_shape.RuntimeError
: if called in Eager mode.
losses
Losses which are associated with this Layer
.
Variable regularization tensors are created when this property is accessed,
so it is eager safe: accessing losses
under a tf.GradientTape
will
propagate gradients back to the corresponding variables.
Returns:
A list of tensors.
name
non_trainable_variables
non_trainable_weights
output
Retrieves the output tensor(s) of a layer.
Only applicable if the layer has exactly one output, i.e. if it is connected to one incoming layer.
Returns:
Output tensor or list of output tensors.
Raises:
AttributeError
: if the layer is connected to more than one incoming layers.RuntimeError
: if called in Eager mode.
output_mask
Retrieves the output mask tensor(s) of a layer.
Only applicable if the layer has exactly one inbound node, i.e. if it is connected to one incoming layer.
Returns:
Output mask tensor (potentially None) or list of output mask tensors.
Raises:
AttributeError
: if the layer is connected to more than one incoming layers.
output_shape
Retrieves the output shape(s) of a layer.
Only applicable if the layer has one output, or if all outputs have the same shape.
Returns:
Output shape, as an integer shape tuple (or list of shape tuples, one tuple per output tensor).
Raises:
AttributeError
: if the layer has no defined output shape.RuntimeError
: if called in Eager mode.
output_size
Integer or TensorShape: size of outputs produced by this cell.
scope_name
state_size
size(s) of state(s) used by this cell.
It can be represented by an Integer, a TensorShape or a tuple of Integers or TensorShapes.
trainable_variables
trainable_weights
updates
variables
Returns the list of all layer variables/weights.
Alias of self.weights
.
Returns:
A list of variables.
weights
Returns the list of all layer variables/weights.
Returns:
A list of variables.
Methods
tf.contrib.rnn.AttentionCellWrapper.__call__
__call__(
inputs,
state,
scope=None
)
Run this RNN cell on inputs, starting from the given state.
Args:
inputs
:2-D
tensor with shape[batch_size, input_size]
.state
: ifself.state_size
is an integer, this should be a2-D Tensor
with shape[batch_size, self.state_size]
. Otherwise, ifself.state_size
is a tuple of integers, this should be a tuple with shapes[batch_size, s] for s in self.state_size
.scope
: VariableScope for the created subgraph; defaults to class name.
Returns:
A pair containing:
- Output: A
2-D
tensor with shape[batch_size, self.output_size]
. - New state: Either a single
2-D
tensor, or a tuple of tensors matching the arity and shapes ofstate
.
tf.contrib.rnn.AttentionCellWrapper.__deepcopy__
__deepcopy__(memo)
tf.contrib.rnn.AttentionCellWrapper.__setattr__
__setattr__(
name,
value
)
Implement setattr(self, name, value).
tf.contrib.rnn.AttentionCellWrapper.apply
apply(
inputs,
*args,
**kwargs
)
Apply the layer on a input.
This is an alias of self.__call__
.
Arguments:
inputs
: Input tensor(s).*args
: additional positional arguments to be passed toself.call
.**kwargs
: additional keyword arguments to be passed toself.call
.
Returns:
Output tensor(s).
tf.contrib.rnn.AttentionCellWrapper.build
build(_)
Creates the variables of the layer (optional, for subclass implementers).
This is a method that implementers of subclasses of Layer
or Model
can override if they need a state-creation step in-between
layer instantiation and layer call.
This is typically used to create the weights of Layer
subclasses.
Arguments:
input_shape
: Instance ofTensorShape
, or list of instances ofTensorShape
if the layer expects a list of inputs (one instance per input).
tf.contrib.rnn.AttentionCellWrapper.compute_mask
compute_mask(
inputs,
mask=None
)
Computes an output mask tensor.
Arguments:
inputs
: Tensor or list of tensors.mask
: Tensor or list of tensors.
Returns:
None or a tensor (or list of tensors, one per output tensor of the layer).
tf.contrib.rnn.AttentionCellWrapper.compute_output_shape
compute_output_shape(input_shape)
Computes the output shape of the layer.
Assumes that the layer will be built to match that input shape provided.
Arguments:
input_shape
: Shape tuple (tuple of integers) or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
Returns:
An input shape tuple.
tf.contrib.rnn.AttentionCellWrapper.count_params
count_params()
Count the total number of scalars composing the weights.
Returns:
An integer count.
Raises:
ValueError
: if the layer isn't yet built (in which case its weights aren't yet defined).
tf.contrib.rnn.AttentionCellWrapper.from_config
from_config(
cls,
config
)
Creates a layer from its config.
This method is the reverse of get_config
,
capable of instantiating the same layer from the config
dictionary. It does not handle layer connectivity
(handled by Network), nor weights (handled by set_weights
).
Arguments:
config
: A Python dictionary, typically the output of get_config.
Returns:
A layer instance.
tf.contrib.rnn.AttentionCellWrapper.get_config
get_config()
Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity
information, nor the layer class name. These are handled
by Network
(one layer of abstraction above).
Returns:
Python dictionary.
tf.contrib.rnn.AttentionCellWrapper.get_initial_state
get_initial_state(
inputs=None,
batch_size=None,
dtype=None
)
tf.contrib.rnn.AttentionCellWrapper.get_input_at
get_input_at(node_index)
Retrieves the input tensor(s) of a layer at a given node.
Arguments:
node_index
: Integer, index of the node from which to retrieve the attribute. E.g.node_index=0
will correspond to the first time the layer was called.
Returns:
A tensor (or list of tensors if the layer has multiple inputs).
Raises:
RuntimeError
: If called in Eager mode.
tf.contrib.rnn.AttentionCellWrapper.get_input_mask_at
get_input_mask_at(node_index)
Retrieves the input mask tensor(s) of a layer at a given node.
Arguments:
node_index
: Integer, index of the node from which to retrieve the attribute. E.g.node_index=0
will correspond to the first time the layer was called.
Returns:
A mask tensor (or list of tensors if the layer has multiple inputs).
tf.contrib.rnn.AttentionCellWrapper.get_input_shape_at
get_input_shape_at(node_index)
Retrieves the input shape(s) of a layer at a given node.
Arguments:
node_index
: Integer, index of the node from which to retrieve the attribute. E.g.node_index=0
will correspond to the first time the layer was called.
Returns:
A shape tuple (or list of shape tuples if the layer has multiple inputs).
Raises:
RuntimeError
: If called in Eager mode.
tf.contrib.rnn.AttentionCellWrapper.get_losses_for
get_losses_for(inputs)
Retrieves losses relevant to a specific set of inputs.
Arguments:
inputs
: Input tensor or list/tuple of input tensors.
Returns:
List of loss tensors of the layer that depend on inputs
.
Raises:
RuntimeError
: If called in Eager mode.
tf.contrib.rnn.AttentionCellWrapper.get_output_at
get_output_at(node_index)
Retrieves the output tensor(s) of a layer at a given node.
Arguments:
node_index
: Integer, index of the node from which to retrieve the attribute. E.g.node_index=0
will correspond to the first time the layer was called.
Returns:
A tensor (or list of tensors if the layer has multiple outputs).
Raises:
RuntimeError
: If called in Eager mode.
tf.contrib.rnn.AttentionCellWrapper.get_output_mask_at
get_output_mask_at(node_index)
Retrieves the output mask tensor(s) of a layer at a given node.
Arguments:
node_index
: Integer, index of the node from which to retrieve the attribute. E.g.node_index=0
will correspond to the first time the layer was called.
Returns:
A mask tensor (or list of tensors if the layer has multiple outputs).
tf.contrib.rnn.AttentionCellWrapper.get_output_shape_at
get_output_shape_at(node_index)
Retrieves the output shape(s) of a layer at a given node.
Arguments:
node_index
: Integer, index of the node from which to retrieve the attribute. E.g.node_index=0
will correspond to the first time the layer was called.
Returns:
A shape tuple (or list of shape tuples if the layer has multiple outputs).
Raises:
RuntimeError
: If called in Eager mode.
tf.contrib.rnn.AttentionCellWrapper.get_updates_for
get_updates_for(inputs)
Retrieves updates relevant to a specific set of inputs.
Arguments:
inputs
: Input tensor or list/tuple of input tensors.
Returns:
List of update ops of the layer that depend on inputs
.
Raises:
RuntimeError
: If called in Eager mode.
tf.contrib.rnn.AttentionCellWrapper.get_weights
get_weights()
Returns the current weights of the layer.
Returns:
Weights values as a list of numpy arrays.
tf.contrib.rnn.AttentionCellWrapper.set_weights
set_weights(weights)
Sets the weights of the layer, from Numpy arrays.
Arguments:
weights
: a list of Numpy arrays. The number of arrays and their shape must match number of the dimensions of the weights of the layer (i.e. it should match the output ofget_weights
).
Raises:
ValueError
: If the provided weights list does not match the layer's specifications.
tf.contrib.rnn.AttentionCellWrapper.zero_state
zero_state(
batch_size,
dtype
)
Return zero-filled state tensor(s).
Args:
batch_size
: int, float, or unit Tensor representing the batch size.dtype
: the data type to use for the state.
Returns:
If state_size
is an int or TensorShape, then the return value is a
N-D
tensor of shape [batch_size, state_size]
filled with zeros.
If state_size
is a nested list or tuple, then the return value is
a nested list or tuple (of the same structure) of 2-D
tensors with
the shapes [batch_size, s]
for each s in state_size
.