OpenCV
4.1.0
Open Source Computer Vision
LSTM recurrent layer.
#include <opencv2/dnn/all_layers.hpp>
Public Member Functions | |
int | inputNameToIndex (String inputName) CV_OVERRIDE |
Returns index of input blob into the input array. | |
int | outputNameToIndex (const String &outputName) CV_OVERRIDE |
Returns index of output blob in output array. | |
virtual void | setOutShape (const MatShape &outTailShape=MatShape())=0 |
Specifies shape of output blob which will be [[T], N] + outTailShape. | |
virtual void | setProduceCellOutput (bool produce=false)=0 |
If this flag is set to true then layer will produce \( c_t \) as second output. | |
virtual void | setUseTimstampsDim (bool use=true)=0 |
Specifies whether the first dimension of the input blob is interpreted as the timestamp dimension or as the sample dimension. | |
virtual void | setWeights (const Mat &Wh, const Mat &Wx, const Mat &b)=0 |
Set trained weights for LSTM layer. | |
Public Member Functions inherited from cv::dnn::Layer | |
Layer () | |
Layer (const LayerParams &params) | |
Initializes only name, type and blobs fields. | |
virtual | ~Layer () |
virtual void | applyHalideScheduler (Ptr< BackendNode > &node, const std::vector< Mat * > &inputs, const std::vector< Mat > &outputs, int targetId) const |
Automatic Halide scheduling based on layer hyper-parameters. | |
virtual void | finalize (const std::vector< Mat * > &input, std::vector< Mat > &output) |
Computes and sets internal parameters according to inputs, outputs and blobs. | |
virtual void | finalize (InputArrayOfArrays inputs, OutputArrayOfArrays outputs) |
Computes and sets internal parameters according to inputs, outputs and blobs. | |
void | finalize (const std::vector< Mat > &inputs, std::vector< Mat > &outputs) |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
std::vector< Mat > | finalize (const std::vector< Mat > &inputs) |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
virtual void | forward (std::vector< Mat * > &input, std::vector< Mat > &output, std::vector< Mat > &internals) |
Given the input blobs, computes the output blobs. | |
virtual void | forward (InputArrayOfArrays inputs, OutputArrayOfArrays outputs, OutputArrayOfArrays internals) |
Given the input blobs, computes the output blobs. | |
void | forward_fallback (InputArrayOfArrays inputs, OutputArrayOfArrays outputs, OutputArrayOfArrays internals) |
Given the input blobs, computes the output blobs. | |
virtual int64 | getFLOPS (const std::vector< MatShape > &inputs, const std::vector< MatShape > &outputs) const |
virtual bool | getMemoryShapes (const std::vector< MatShape > &inputs, const int requiredOutputs, std::vector< MatShape > &outputs, std::vector< MatShape > &internals) const |
virtual void | getScaleShift (Mat &scale, Mat &shift) const |
Returns parameters of layers with channel-wise multiplication and addition. | |
virtual Ptr< BackendNode > | initHalide (const std::vector< Ptr< BackendWrapper > > &inputs) |
Returns Halide backend node. | |
virtual Ptr< BackendNode > | initInfEngine (const std::vector< Ptr< BackendWrapper > > &inputs) |
virtual Ptr< BackendNode > | initVkCom (const std::vector< Ptr< BackendWrapper > > &inputs) |
void | run (const std::vector< Mat > &inputs, std::vector< Mat > &outputs, std::vector< Mat > &internals) |
Allocates layer and computes output. | |
virtual bool | setActivation (const Ptr< ActivationLayer > &layer) |
Tries to attach to the layer the subsequent activation layer, i.e. do the layer fusion in a partial case. | |
void | setParamsFrom (const LayerParams &params) |
Initializes only name, type and blobs fields. | |
virtual bool | supportBackend (int backendId) |
Asks the layer whether it supports a specific backend for computations. | |
virtual Ptr< BackendNode > | tryAttach (const Ptr< BackendNode > &node) |
Implement layers fusing. | |
virtual bool | tryFuse (Ptr< Layer > &top) |
Try to fuse current layer with a next one. | |
virtual void | unsetAttached () |
Detaches all the layers attached to this particular layer. | |
Public Member Functions inherited from cv::Algorithm | |
Algorithm () | |
virtual | ~Algorithm () |
virtual void | clear () |
Clears the algorithm state. | |
virtual bool | empty () const |
Returns true if the Algorithm is empty (e.g. at the very beginning or after an unsuccessful read). | |
virtual String | getDefaultName () const |
virtual void | read (const FileNode &fn) |
Reads algorithm parameters from a file storage. | |
virtual void | save (const String &filename) const |
virtual void | write (FileStorage &fs) const |
Stores algorithm parameters in a file storage. | |
void | write (const Ptr< FileStorage > &fs, const String &name=String()) const |
Simplified API for language bindings. This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
Static Public Member Functions | |
static Ptr< LSTMLayer > | create (const LayerParams &params) |
Additional Inherited Members | |
Public Attributes inherited from cv::dnn::Layer | |
std::vector< Mat > | blobs |
List of learned parameters; they must be stored here so that they can be read via Net::getParam(). | |
String | name |
Name of the layer instance, can be used for logging or other internal purposes. | |
int | preferableTarget |
Preferred target for layer forwarding. | |
String | type |
Type name which was used for creating layer by layer factory. | |
Protected Member Functions inherited from cv::Algorithm | |
void | writeFormat (FileStorage &fs) const |
LSTM recurrent layer.
create() | static |
Creates an instance of the LSTM layer.
inputNameToIndex() | virtual |
Returns index of input blob into the input array.
inputName | label of input blob |
Each layer input and output can be labeled for easy identification using the "<layer_name>[.output_name]" notation. This method maps the label of an input blob to its index in the input vector.
Reimplemented from cv::dnn::Layer.
outputNameToIndex() | virtual |
Returns index of output blob in output array.
Reimplemented from cv::dnn::Layer.
setOutShape() | pure virtual |
Specifies shape of output blob which will be [[T], N] + outTailShape.
If this parameter is empty or unset then outTailShape = [Wh.size(0)] will be used, where Wh is the parameter from setWeights().
setProduceCellOutput() | pure virtual |
If this flag is set to true then layer will produce \( c_t \) as second output.
The corresponding flag in LayerParams is produce_cell_output. The shape of the second output is the same as that of the first output.
setUseTimstampsDim() | pure virtual |
Specifies whether the first dimension of the input blob is interpreted as the timestamp dimension or as the sample dimension.
The corresponding flag in LayerParams is use_timestamp_dim. If the flag is set to true, the shape of the input blob is interpreted as [T, N, [data dims]], where T is the number of timestamps and N is the number of independent streams. In this case each forward() call iterates through T timestamps and updates the layer's state T times.
If the flag is set to false, the shape of the input blob is interpreted as [N, [data dims]]. In this case each forward() call makes one iteration and produces one timestamp with shape [N, [out dims]].
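The two input layouts and the resulting output shape [[T], N] + outTailShape can be illustrated with a small standalone sketch. The helper lstmOutputShape below is hypothetical and is not part of the OpenCV API; it uses plain std::vector<int> instead of MatShape and assumes the caller supplies outTailShape explicitly.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OpenCV function): builds the output shape
// [[T], N] + outTailShape from the input shape.
std::vector<int> lstmOutputShape(const std::vector<int>& inputShape,
                                 bool useTimestampDim,
                                 const std::vector<int>& outTailShape)
{
    std::vector<int> out;
    // With the timestamp dimension the input is [T, N, data dims...];
    // without it the input is [N, data dims...].
    const std::size_t lead = useTimestampDim ? 2 : 1;
    if (inputShape.size() < lead)
        return out; // malformed input: return an empty shape
    // Keep the leading [T, N] or [N] part, then append the tail shape.
    out.assign(inputShape.begin(), inputShape.begin() + lead);
    out.insert(out.end(), outTailShape.begin(), outTailShape.end());
    return out;
}
```

For example, an input of shape [5, 2, 10] with the flag set and outTailShape = [8] gives [5, 2, 8], while an input of shape [2, 10] with the flag cleared gives [2, 8].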
setWeights() | pure virtual |
Set trained weights for LSTM layer.
LSTM behavior on each step is defined by current input, previous output, previous cell state and learned weights.
Let \(x_t\) be the current input, \(h_t\) the current output and \(c_t\) the current cell state. Then the current output and the current cell state are computed as follows:
\begin{eqnarray*} h_t &=& o_t \odot \tanh(c_t), \\ c_t &=& f_t \odot c_{t-1} + i_t \odot g_t, \end{eqnarray*}
where \(\odot\) is the per-element multiply operation and \(i_t, f_t, o_t, g_t\) are internal gates that are computed using learned weights.
Gates are computed as follows:
\begin{eqnarray*} i_t &=& \mathrm{sigmoid}(W_{xi} x_t + W_{hi} h_{t-1} + b_i), \\ f_t &=& \mathrm{sigmoid}(W_{xf} x_t + W_{hf} h_{t-1} + b_f), \\ o_t &=& \mathrm{sigmoid}(W_{xo} x_t + W_{ho} h_{t-1} + b_o), \\ g_t &=& \tanh(W_{xg} x_t + W_{hg} h_{t-1} + b_g), \end{eqnarray*}
where \(W_{x?}\), \(W_{h?}\) and \(b_{?}\) are learned weights represented as matrices: \(W_{x?} \in R^{N_h \times N_x}\), \(W_{h?} \in R^{N_h \times N_h}\), \(b_? \in R^{N_h}\).
For simplicity and performance we use \( W_x = [W_{xi}; W_{xf}; W_{xo}; W_{xg}] \) (i.e. \(W_x\) is the vertical concatenation of \( W_{x?} \)), \( W_x \in R^{4N_h \times N_x} \). The same holds for \( W_h = [W_{hi}; W_{hf}; W_{ho}; W_{hg}] \), \( W_h \in R^{4N_h \times N_h} \), and for \( b = [b_i; b_f; b_o; b_g] \), \( b \in R^{4N_h} \).
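The vertical concatenation can be sketched without OpenCV as follows. The Matrix struct and the stackGates function are illustrative names introduced here, not library types; the sketch assumes row-major storage and the gate order i, f, o, g used in the notation above.

```cpp
#include <initializer_list>
#include <vector>

// Illustrative row-major matrix (an assumption for this sketch, not cv::Mat).
struct Matrix {
    int rows;
    int cols;
    std::vector<float> data;
};

// Stacks the four per-gate matrices vertically in the order i, f, o, g.
// All four inputs are assumed to have the same number of columns.
Matrix stackGates(const Matrix& i, const Matrix& f,
                  const Matrix& o, const Matrix& g)
{
    Matrix out{0, i.cols, {}};
    for (const Matrix* m : {&i, &f, &o, &g}) {
        out.rows += m->rows;
        // In row-major storage, vertical stacking is plain concatenation.
        out.data.insert(out.data.end(), m->data.begin(), m->data.end());
    }
    return out;
}
```

Applied to four \(N_h \times N_x\) gate matrices, this yields a \(4N_h \times N_x\) matrix whose row blocks follow the order given above; the same routine works for \(W_h\) and, with a single column, for \(b\).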
Wh | matrix defining how the previous output is transformed to the internal gates (\( W_h \) in the notation above) |
Wx | matrix defining how the current input is transformed to the internal gates (\( W_x \) in the notation above) |
b | bias vector (\( b \) in the notation above) |
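To make the update rule concrete, here is a minimal standalone sketch of a single LSTM step written directly from the equations above. It is not OpenCV's implementation: the names LstmState, sigmoid and lstmStep are hypothetical, matrices are flat row-major std::vector<float> buffers, and the gate blocks are assumed to be laid out in the order i, f, o, g.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical state holder for this sketch: h is the output, c the cell state.
struct LstmState {
    std::vector<float> h; // size Nh
    std::vector<float> c; // size Nh
};

float sigmoid(float v) { return 1.0f / (1.0f + std::exp(-v)); }

// One LSTM step. Wx is (4*Nh x Nx) and Wh is (4*Nh x Nh), both row-major;
// b has 4*Nh entries. On entry s holds h_{t-1}, c_{t-1}; on exit h_t, c_t.
void lstmStep(const std::vector<float>& Wx, const std::vector<float>& Wh,
              const std::vector<float>& b, const std::vector<float>& x,
              LstmState& s)
{
    const std::size_t Nh = s.h.size();
    const std::size_t Nx = x.size();
    // Pre-activations z = Wx * x_t + Wh * h_{t-1} + b for all four gates.
    std::vector<float> z(b);
    for (std::size_t r = 0; r < 4 * Nh; ++r) {
        for (std::size_t k = 0; k < Nx; ++k) z[r] += Wx[r * Nx + k] * x[k];
        for (std::size_t k = 0; k < Nh; ++k) z[r] += Wh[r * Nh + k] * s.h[k];
    }
    // Apply the gate non-linearities and update the state element-wise.
    for (std::size_t j = 0; j < Nh; ++j) {
        const float i = sigmoid(z[j]);
        const float f = sigmoid(z[Nh + j]);
        const float o = sigmoid(z[2 * Nh + j]);
        const float g = std::tanh(z[3 * Nh + j]);
        s.c[j] = f * s.c[j] + i * g;
        s.h[j] = o * std::tanh(s.c[j]);
    }
}
```

With all weights and biases set to zero, the three sigmoid gates evaluate to 0.5 and \(g_t\) to 0, so the step halves the cell state and outputs \(0.5 \tanh(c_t)\), which is a convenient sanity check for the indexing.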