lstm_unit

paddle.fluid.layers.lstm_unit(x_t, hidden_t_prev, cell_t_prev, forget_bias=0.0, param_attr=None, bias_attr=None, name=None)

api_attr

Static Graph

Long Short-Term Memory (LSTM) RNN cell. This operator performs the LSTM calculation for a single time step. Its implementation follows the calculations described in RECURRENT NEURAL NETWORK REGULARIZATION.

forget_bias is added to the bias of the forget gate. A positive value pushes \(f_{t}\) toward 1, which reduces the amount of forgetting. The formula is as follows:

\[
\begin{aligned}
i_{t} &= \sigma(W_{x_{i}}x_{t} + W_{h_{i}}h_{t-1} + b_{i}) \\
f_{t} &= \sigma(W_{x_{f}}x_{t} + W_{h_{f}}h_{t-1} + b_{f} + forget\_bias) \\
c_{t} &= f_{t}c_{t-1} + i_{t}\tanh(W_{x_{c}}x_{t} + W_{h_{c}}h_{t-1} + b_{c}) \\
o_{t} &= \sigma(W_{x_{o}}x_{t} + W_{h_{o}}h_{t-1} + b_{o}) \\
h_{t} &= o_{t}\tanh(c_{t})
\end{aligned}
\]

\(x_{t}\) stands for x_t, the input at the current time step. \(h_{t-1}\) and \(c_{t-1}\) correspond to hidden_t_prev and cell_t_prev, the hidden and cell outputs of the previous time step. \(i_{t}, f_{t}, c_{t}, o_{t}, h_{t}\) are the input gate, forget gate, cell state, output gate and hidden state, respectively. The products \(f_{t}c_{t-1}\), \(i_{t}\tanh(\cdot)\) and \(o_{t}\tanh(c_{t})\) are element-wise.
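The formulas above can be traced step by step with a small NumPy sketch. This is not Paddle's implementation: the function name lstm_step, the per-gate dictionaries W_x, W_h and b, and the row-major convention (x_t @ W instead of W x_t) are assumptions made purely for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W_x, W_h, b, forget_bias=0.0):
    """One LSTM time step.

    W_x, W_h and b are dicts keyed by gate name ("i", "f", "c", "o").
    This layout is an illustrative assumption, not Paddle's storage format.
    """
    def pre(gate):
        # Pre-activation of one gate: x_t W_x + h_{t-1} W_h + b
        return x_t @ W_x[gate] + h_prev @ W_h[gate] + b[gate]

    i_t = sigmoid(pre("i"))                        # input gate
    f_t = sigmoid(pre("f") + forget_bias)          # forget gate
    c_t = f_t * c_prev + i_t * np.tanh(pre("c"))   # new cell state
    o_t = sigmoid(pre("o"))                        # output gate
    h_t = o_t * np.tanh(c_t)                       # new hidden state
    return h_t, c_t


# N = batch size, M = input feature size, D = hidden size
N, M, D = 4, 3, 5
rng = np.random.default_rng(0)
x_t = rng.standard_normal((N, M))
h_prev = np.zeros((N, D))
c_prev = np.zeros((N, D))
W_x = {g: 0.1 * rng.standard_normal((M, D)) for g in "ifco"}
W_h = {g: 0.1 * rng.standard_normal((D, D)) for g in "ifco"}
b = {g: np.zeros(D) for g in "ifco"}

h_t, c_t = lstm_step(x_t, h_prev, c_prev, W_x, W_h, b, forget_bias=1.0)
print(h_t.shape, c_t.shape)
```

Both outputs have shape [N, D], matching the shape of hidden_t_prev and cell_t_prev.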

Parameters

x_t (Variable) – A 2D Tensor representing the input of the current time step. Its shape should be \([N, M]\), where \(N\) stands for the batch size and \(M\) for the feature size of the input. The data type should be float32 or float64.
hidden_t_prev (Variable) – A 2D Tensor representing the hidden value from the previous step. Its shape should be \([N, D]\), where \(N\) stands for the batch size and \(D\) for the hidden size. The data type should be the same as that of x_t.
cell_t_prev (Variable) – A 2D Tensor representing the cell value from the previous step. It has the same shape and data type as hidden_t_prev.
forget_bias (float, optional) – The \(forget\_bias\) term added to the bias of the forget gate. Default: 0.0.
param_attr (ParamAttr, optional) – Specifies the weight parameter property. Default: None, which means the default weight parameter property is used. See api_fluid_ParamAttr for usage details.
bias_attr (ParamAttr, optional) – Specifies the bias parameter property. Default: None, which means the default bias parameter property is used. See api_fluid_ParamAttr for usage details.
name (str, optional) – For detailed information, please refer to Name. Usually there is no need to set name; it is None by default.
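To get a feel for what forget_bias does, the following sketch evaluates the forget-gate sigmoid for a few bias values. The pre-activation value of 0.0 is a hypothetical stand-in for \(W_{x_{f}}x_{t} + W_{h_{f}}h_{t-1} + b_{f}\); only the standard library is used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical forget-gate pre-activation (everything except forget_bias).
pre_activation = 0.0

for forget_bias in (0.0, 1.0, 2.0):
    f_t = sigmoid(pre_activation + forget_bias)
    print(f"forget_bias={forget_bias}: f_t={f_t:.3f}")
# forget_bias=0.0: f_t=0.500
# forget_bias=1.0: f_t=0.731
# forget_bias=2.0: f_t=0.881
```

A larger forget_bias moves \(f_{t}\) closer to 1, so more of the previous cell value \(c_{t-1}\) is carried into \(c_{t}\).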

Returns

A tuple of two Tensor variables with the same shape and data type as hidden_t_prev, representing the hidden value and the cell value, which correspond to \(h_{t}\) and \(c_{t}\) in the formula.

Return type

tuple

Raises

ValueError – Rank of x_t must be 2.
ValueError – Rank of hidden_t_prev must be 2.
ValueError – Rank of cell_t_prev must be 2.
ValueError – The 1st dimensions of x_t, hidden_t_prev and cell_t_prev must be the same.
ValueError – The 2nd dimensions of hidden_t_prev and cell_t_prev must be the same.
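The conditions above can be mirrored in plain Python to check shapes before building the graph. The helper check_lstm_unit_shapes is hypothetical and is not part of Paddle. It works on fully specified shape tuples and makes no attempt to handle an unspecified batch dimension.

```python
def check_lstm_unit_shapes(x_shape, hidden_shape, cell_shape):
    """Hypothetical helper mirroring the documented ValueError conditions."""
    named = (("x_t", x_shape),
             ("hidden_t_prev", hidden_shape),
             ("cell_t_prev", cell_shape))
    # All three inputs must be 2D.
    for name, shape in named:
        if len(shape) != 2:
            raise ValueError(f"Rank of {name} must be 2.")
    # The batch size (1st dimension) must agree across all inputs.
    if not (x_shape[0] == hidden_shape[0] == cell_shape[0]):
        raise ValueError("The 1st dimensions of x_t, hidden_t_prev and "
                         "cell_t_prev must be the same.")
    # The hidden size (2nd dimension) must agree between the two states.
    if hidden_shape[1] != cell_shape[1]:
        raise ValueError("The 2nd dimensions of hidden_t_prev and "
                         "cell_t_prev must be the same.")


# Valid: batch size 4, input size 64, hidden size 512.
check_lstm_unit_shapes((4, 64), (4, 512), (4, 512))

# Invalid: the cell state has a different hidden size.
try:
    check_lstm_unit_shapes((4, 64), (4, 512), (4, 256))
except ValueError as err:
    print(err)
```

Note that x_t may have a different 2nd dimension (M) from the states (D); only the batch size has to match.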

Examples

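import paddle.fluid as fluid

dict_dim, emb_dim, hidden_dim = 128, 64, 512
# Token ids for one time step; the batch dimension is left unspecified.
data = fluid.data(name='step_data', shape=[None], dtype='int64')
# The embedding lookup produces the step input x_t with shape [N, emb_dim].
x = fluid.embedding(input=data, size=[dict_dim, emb_dim])
# Hidden and cell values from the previous step, both of shape [N, hidden_dim].
pre_hidden = fluid.data(
    name='pre_hidden', shape=[None, hidden_dim], dtype='float32')
pre_cell = fluid.data(
    name='pre_cell', shape=[None, hidden_dim], dtype='float32')
# lstm_unit returns a tuple: the hidden value and the cell value of this step.
hidden, cell = fluid.layers.lstm_unit(
    x_t=x,
    hidden_t_prev=pre_hidden,
    cell_t_prev=pre_cell)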