chainer.functions.lstm
chainer.functions.lstm(c_prev, x)
Long Short-Term Memory units as an activation function.

This function implements LSTM units with forget gates. Let c_prev be the previous cell state and x the input array.

First, the input array x is split into four arrays \(a, i, f, o\) of the same shape along the second axis. This means that the size of x's second axis must be four times the size of c_prev's second axis.

The split input arrays correspond to:

- \(a\) : sources of cell input
- \(i\) : sources of input gate 
- \(f\) : sources of forget gate 
- \(o\) : sources of output gate 
Second, it computes the updated cell state c and the outgoing signal h as:

\[\begin{split}
c &= \tanh(a) \sigma(i) + c_{\text{prev}} \sigma(f), \\
h &= \tanh(c) \sigma(o),
\end{split}\]

where \(\sigma\) is the elementwise sigmoid function. These are returned as a tuple of two variables.
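For reference, the cell update above can be written directly in NumPy. The following is a minimal sketch of the arithmetic only; the helper name lstm_reference is illustrative, and the contiguous np.split layout is an assumption for illustration, since the exact layout of the four sources inside x is an implementation detail of Chainer:

>>> import numpy as np
>>> def lstm_reference(c_prev, x):
...     sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
...     # Split x into the four sources a, i, f, o along the second axis
...     # (contiguous split assumed here; Chainer's internal layout may differ).
...     a, i, f, o = np.split(x, 4, axis=1)
...     c = np.tanh(a) * sigmoid(i) + c_prev * sigmoid(f)  # updated cell state
...     h = np.tanh(c) * sigmoid(o)                        # outgoing signal
...     return c, h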
This function supports variable-length inputs. The mini-batch size of the current input must be equal to or smaller than that of the previous one. When the mini-batch size of x is smaller than that of c, this function only updates c[0:len(x)] and leaves the rest, c[len(x):], unchanged. So, please sort input sequences in descending order of length before applying the function (see the sketch after the parameter list below).

- Parameters
- c_prev (Variable or N-dimensional array) – Variable that holds the previous cell state. The cell state should be a zero array or the output of the previous call of LSTM.
- x (Variable or N-dimensional array) – Variable that holds the sources of cell input, input gate, forget gate and output gate. Its second dimension must be four times the size of the cell state's second dimension.
 
- Returns
- Two Variable objects c and h. c is the updated cell state. h indicates the outgoing signal.
- Return type
- tuple
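As noted above, when the mini-batch size of x is smaller than that of c, only the first len(x) rows of the cell state are updated. A minimal sketch of that behaviour, assuming the usual imports of Chainer's documentation (numpy as np, chainer, and chainer.functions as F):

>>> n_units = 100
>>> c = chainer.Variable(np.zeros((3, n_units), np.float32))
>>> x = chainer.Variable(np.zeros((2, 4 * n_units), np.float32))
>>> c, h = F.lstm(c, x)  # only c[0:2] is updated; c[2:] is carried over unchanged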
See the original paper proposing LSTM with forget gates: Long Short-Term Memory in Recurrent Neural Networks.

See also

chainer.links.LSTM

Example

Assume y is the current incoming signal, c is the previous cell state, and h is the previous outgoing signal of an lstm function. Each of y, c and h has n_units channels. The most typical preparation of x is:

>>> n_units = 100
>>> y = chainer.Variable(np.zeros((1, n_units), np.float32))
>>> h = chainer.Variable(np.zeros((1, n_units), np.float32))
>>> c = chainer.Variable(np.zeros((1, n_units), np.float32))
>>> model = chainer.Chain()
>>> with model.init_scope():
...     model.w = L.Linear(n_units, 4 * n_units)
...     model.v = L.Linear(n_units, 4 * n_units)
>>> x = model.w(y) + model.v(h)
>>> c, h = F.lstm(c, x)

This corresponds to computing the input array x, i.e. the input sources \(a, i, f, o\), from the current incoming signal y and the previous outgoing signal h. Different parameters are used for different kinds of input sources.

Note

We use the naming rule below.

- incoming signal
- The formal input of the formulation of LSTM (e.g. in NLP, a word vector or the output of a lower RNN layer). The input of chainer.links.LSTM is the incoming signal.
 
- input array
- The array which is linearly transformed from the incoming signal and the previous outgoing signal. The input array contains four sources: the sources of cell input, input gate, forget gate and output gate. The input of chainer.functions.activation.lstm.LSTM is the input array.
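To make the distinction concrete, here is a minimal sketch contrasting the two interfaces, reusing n_units, y, c and x from the example above. The link holds its own state internally, while the function takes everything explicitly:

>>> lstm_link = L.LSTM(n_units, n_units)
>>> h_link = lstm_link(y)  # the link takes the incoming signal y; c and h are kept inside
>>> c, h = F.lstm(c, x)    # the function takes the pre-computed input array x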