lstm_unit
- paddle.fluid.layers.lstm_unit(x_t, hidden_t_prev, cell_t_prev, forget_bias=0.0, param_attr=None, bias_attr=None, name=None)
-
- api_attr
-
Static Graph
Long Short-Term Memory (LSTM) RNN cell. This operator performs the LSTM calculation for a single time step. Its implementation follows the calculations described in RECURRENT NEURAL NETWORK REGULARIZATION.
We add forget_bias to the biases of the forget gate in order to reduce the scale of forgetting. The formula is as follows:
\[
\begin{aligned}
i_{t} & = \sigma(W_{x_{i}}x_{t} + W_{h_{i}}h_{t-1} + b_{i}) \\
f_{t} & = \sigma(W_{x_{f}}x_{t} + W_{h_{f}}h_{t-1} + b_{f} + forget\_bias) \\
c_{t} & = f_{t}c_{t-1} + i_{t} \tanh(W_{x_{c}}x_{t} + W_{h_{c}}h_{t-1} + b_{c}) \\
o_{t} & = \sigma(W_{x_{o}}x_{t} + W_{h_{o}}h_{t-1} + b_{o}) \\
h_{t} & = o_{t} \tanh(c_{t})
\end{aligned}
\]
\(x_{t}\) stands for x_t, the input of the current time step; \(h_{t-1}\) and \(c_{t-1}\) correspond to hidden_t_prev and cell_t_prev, the outputs from the previous time step. \(i_{t}, f_{t}, c_{t}, o_{t}, h_{t}\) are the input gate, forget gate, cell state, output gate and hidden state, respectively.
- Parameters
-
x_t (Variable) – A 2D Tensor representing the input of current time step. Its shape should be \([N, M]\) , where \(N\) stands for batch size, \(M\) for the feature size of input. The data type should be float32 or float64.
hidden_t_prev (Variable) – A 2D Tensor representing the hidden value from the previous step. Its shape should be \([N, D]\), where \(N\) stands for batch size, \(D\) for the hidden size. The data type should be the same as x_t.
cell_t_prev (Variable) – A 2D Tensor representing the cell value from the previous step. It has the same shape and data type as hidden_t_prev.
forget_bias (float, optional) – The value added to the biases of the forget gate (forget_bias in the formula). Default: 0.0.
param_attr (ParamAttr, optional) – Specifies the weight parameter property. Default: None, which means the default weight parameter property is used. See api_fluid_ParamAttr for usage details.
bias_attr (ParamAttr, optional) – Specifies the bias parameter property. Default: None, which means the default bias parameter property is used. See api_fluid_ParamAttr for usage details.
name (str, optional) – For detailed information, please refer to Name. Usually there is no need to set name. It is None by default.
- Returns
-
A tuple of two Tensor variables with the same shape and data type as hidden_t_prev, representing the hidden value and cell value, which correspond to \(h_{t}\) and \(c_{t}\) in the formula.
- Return type
-
tuple
- Raises
-
ValueError – Rank of x_t must be 2.
ValueError – Rank of hidden_t_prev must be 2.
ValueError – Rank of cell_t_prev must be 2.
ValueError – The 1st dimensions of x_t, hidden_t_prev and cell_t_prev must be the same.
ValueError – The 2nd dimensions of hidden_t_prev and cell_t_prev must be the same.
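The conditions listed under Raises can be mirrored in a small standalone check, which may be handy for validating shapes before building the graph. This is only a sketch: `check_lstm_unit_shapes` is a hypothetical helper that works on plain shape tuples and is not part of Paddle.

```python
def check_lstm_unit_shapes(x_shape, hidden_shape, cell_shape):
    """Hypothetical helper: raise ValueError under the documented conditions."""
    named = (("x_t", x_shape),
             ("hidden_t_prev", hidden_shape),
             ("cell_t_prev", cell_shape))
    # All three inputs must be 2D.
    for name, shape in named:
        if len(shape) != 2:
            raise ValueError(f"Rank of {name} must be 2.")
    # The batch dimension N must agree across all inputs.
    if not (x_shape[0] == hidden_shape[0] == cell_shape[0]):
        raise ValueError(
            "The 1st dimensions of x_t, hidden_t_prev and cell_t_prev "
            "must be the same.")
    # The hidden size D must agree between hidden and cell state.
    if hidden_shape[1] != cell_shape[1]:
        raise ValueError(
            "The 2nd dimensions of hidden_t_prev and cell_t_prev "
            "must be the same.")


# Valid shapes: N=4, M=64, D=512. No exception is raised.
check_lstm_unit_shapes((4, 64), (4, 512), (4, 512))
```

Note that the feature size of x_t is unconstrained here, matching the list above: only the batch dimension is shared with the two state tensors.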
Examples
import paddle.fluid as fluid

dict_dim, emb_dim, hidden_dim = 128, 64, 512

# Token ids for one time step; the embedding lookup yields x_t with shape [N, emb_dim].
data = fluid.data(name='step_data', shape=[None], dtype='int64')
x = fluid.embedding(input=data, size=[dict_dim, emb_dim])

# Hidden and cell states from the previous time step, each with shape [N, hidden_dim].
pre_hidden = fluid.data(
    name='pre_hidden', shape=[None, hidden_dim], dtype='float32')
pre_cell = fluid.data(
    name='pre_cell', shape=[None, hidden_dim], dtype='float32')

# lstm_unit returns a tuple (h_t, c_t).
hidden, cell = fluid.layers.lstm_unit(
    x_t=x, hidden_t_prev=pre_hidden, cell_t_prev=pre_cell)
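To make the formula in this section concrete, the following NumPy sketch computes one LSTM step outside of Paddle. It is an illustration under stated assumptions, not Paddle's implementation: `lstm_unit_reference` is a hypothetical name, the weights are kept as one matrix per gate in plain dicts, and inputs are batched row-wise, so \(W x\) in the formula becomes `x @ W` here.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_unit_reference(x_t, h_prev, c_prev, W_x, W_h, b, forget_bias=0.0):
    """One LSTM step following the formula in this section.

    W_x, W_h and b are dicts keyed by gate name ("i", "f", "c", "o").
    This layout is an assumption made for readability only.
    """
    def pre(g):
        # Pre-activation of gate g for a row-wise batch.
        return x_t @ W_x[g] + h_prev @ W_h[g] + b[g]

    i_t = sigmoid(pre("i"))
    f_t = sigmoid(pre("f") + forget_bias)  # forget_bias shifts the forget gate
    c_t = f_t * c_prev + i_t * np.tanh(pre("c"))
    o_t = sigmoid(pre("o"))
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t


rng = np.random.default_rng(0)
N, M, D = 4, 3, 5  # batch size, input feature size, hidden size
x_t = rng.standard_normal((N, M))
h_prev = rng.standard_normal((N, D))
c_prev = rng.standard_normal((N, D))
W_x = {g: rng.standard_normal((M, D)) * 0.1 for g in "ifco"}
W_h = {g: rng.standard_normal((D, D)) * 0.1 for g in "ifco"}
b = {g: np.zeros(D) for g in "ifco"}

h_t, c_t = lstm_unit_reference(x_t, h_prev, c_prev, W_x, W_h, b, forget_bias=1.0)
print(h_t.shape, c_t.shape)  # (4, 5) (4, 5)
```

Both outputs have shape \([N, D]\), the same as hidden_t_prev. A positive forget_bias pushes \(f_{t}\) toward 1, so more of \(c_{t-1}\) is carried into \(c_{t}\).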