chainer.functions.slstm
chainer.functions.slstm(c_prev1, c_prev2, x1, x2)
S-LSTM units as an activation function.
This function implements the S-LSTM unit, an extension of the LSTM unit to tree structures. It is applied to binary trees, in which each node has two child nodes. It takes four arguments: the previous cell states c_prev1 and c_prev2, and the input arrays x1 and x2.
First, each of the input arrays x1 and x2 is split into four arrays along the second axis, giving eight arrays in total: $a_1, i_1, f_1, o_1$ and $a_2, i_2, f_2, o_2$. All eight have the same size along the second axis, so the second axis of x1 and x2 must be 4 times as long as that of c_prev1 and c_prev2.
The split input arrays correspond to:
$a_i$ : sources of cell input
$i_i$ : sources of input gate
$f_i$ : sources of forget gate
$o_i$ : sources of output gate
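The shape relationship described above can be checked with a small NumPy sketch. The helper name `split_gates` and the contiguous four-way split are assumptions made for illustration; the order in which Chainer actually lays out the four sources along the second axis may differ.

```python
import numpy as np


def split_gates(x):
    """Split x of shape (batch, 4 * n_units) into four (batch, n_units) arrays.

    Hypothetical helper: the contiguous a, i, f, o layout is an assumption
    for illustration, not necessarily Chainer's internal ordering.
    """
    if x.shape[1] % 4 != 0:
        raise ValueError("second axis of x must be a multiple of 4")
    a, i, f, o = np.split(x, 4, axis=1)
    return a, i, f, o


n_units = 3
# A batch of 2 inputs whose second axis is 4 times the cell-state size.
x1 = np.arange(2 * 4 * n_units, dtype=np.float32).reshape(2, 4 * n_units)
a1, i1, f1, o1 = split_gates(x1)
print(a1.shape, i1.shape, f1.shape, o1.shape)
```

Each of the four pieces has the same shape as a cell state with `n_units` channels, which is why the second axis of the input must be exactly four times as long.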
It computes the updated cell state c and the outgoing signal h as:

$$c = \tanh(a_1 + a_2)\,\sigma(i_1 + i_2) + c_{\text{prev}1}\,\sigma(f_1) + c_{\text{prev}2}\,\sigma(f_2),$$
$$h = \tanh(c)\,\sigma(o_1 + o_2),$$

where $\sigma$ is the elementwise sigmoid function. The function returns c and h as a tuple.
- Parameters
c_prev1 (Variable or N-dimensional array) – Variable that holds the previous cell state of the first child node. The cell state should be a zero array or the output of the previous call of LSTM.
c_prev2 (Variable or N-dimensional array) – Variable that holds the previous cell state of the second child node.
x1 (Variable or N-dimensional array) – Variable that holds the sources of cell input, input gate, forget gate and output gate from the first child node. The size of its second dimension must be four times that of the cell state.
x2 (Variable or N-dimensional array) – Variable that holds the input sources from the second child node.
- Returns
Two Variable objects c and h. c is the cell state. h indicates the outgoing signal.
- Return type
tuple
For details, see the paper: Long Short-Term Memory Over Tree Structures.
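As a cross-check of the update rule given above, the following is a minimal NumPy sketch that applies the formulas to sources that have already been split. The names `sigmoid` and `slstm_reference` are hypothetical helpers, not part of Chainer, and the sketch ignores gradients and GPU arrays.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))


def slstm_reference(c_prev1, c_prev2, gates1, gates2):
    """Apply the S-LSTM update to pre-split sources.

    gates1 and gates2 are (a, i, f, o) tuples for the first and second
    child; every array has the same shape as the cell states.
    """
    a1, i1, f1, o1 = gates1
    a2, i2, f2, o2 = gates2
    c = (np.tanh(a1 + a2) * sigmoid(i1 + i2)
         + c_prev1 * sigmoid(f1)
         + c_prev2 * sigmoid(f2))
    h = np.tanh(c) * sigmoid(o1 + o2)
    return c, h


# With all sources at zero, tanh(0) = 0 and sigmoid(0) = 0.5, so the new
# cell state is the average of the two child cell states.
zeros = np.zeros((1, 2))
c_prev1 = np.array([[1.0, 2.0]])
c_prev2 = np.array([[3.0, 4.0]])
c, h = slstm_reference(c_prev1, c_prev2, (zeros,) * 4, (zeros,) * 4)
print(c)  # [[2. 3.]]
```

The zero-input case makes the role of the two forget gates visible: each child's cell state is scaled by its own gate before the two are summed.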
Example
Assume that c1 and c2 are the previous cell states of the children, and that h1 and h2 are the previous outgoing signals from the children. Each of c1, c2, h1 and h2 has n_units channels. The most typical preparation of x1 and x2 is:
>>> import numpy as np
>>> import chainer
>>> import chainer.functions as F
>>> import chainer.links as L
>>> n_units = 100
>>> h1 = chainer.Variable(np.zeros((1, n_units), np.float32))
>>> h2 = chainer.Variable(np.zeros((1, n_units), np.float32))
>>> c1 = chainer.Variable(np.zeros((1, n_units), np.float32))
>>> c2 = chainer.Variable(np.zeros((1, n_units), np.float32))
>>> model1 = chainer.Chain()
>>> with model1.init_scope():
...     model1.w = L.Linear(n_units, 4 * n_units)
...     model1.v = L.Linear(n_units, 4 * n_units)
>>> model2 = chainer.Chain()
>>> with model2.init_scope():
...     model2.w = L.Linear(n_units, 4 * n_units)
...     model2.v = L.Linear(n_units, 4 * n_units)
>>> x1 = model1.w(c1) + model1.v(h1)
>>> x2 = model2.w(c2) + model2.v(h2)
>>> c, h = F.slstm(c1, c2, x1, x2)
This corresponds to calculating the input array x1, that is, the input sources $a_1, i_1, f_1, o_1$, from the previous cell state c1 and the previous outgoing signal h1 of the first child node. The input array x2 is calculated in the same way from c2 and h2 of the second child node. Different parameters are used for different kinds of input sources.