sequence_pad
- paddle.static.nn.sequence_pad(x, pad_value, maxlen=None, name=None)
-
This layer pads the sequences in the same batch to a common length (according to maxlen). The padding value is defined by pad_value and is appended to the tail of each sequence. The result is a Python tuple (Out, Length): the Tensor Out holds the padded sequences, and the Tensor Length holds the lengths of the input sequences. To remove the padding data (the unpadding operation), see sequence_unpad.
Note
Please note that the input x should be a Tensor.

Case 1:
    Given input 1-level Tensor x:
        x.lod = [[0, 2, 5]]
        x.data = [[a],[b],[c],[d],[e]]
    pad_value:
        pad_value.data = [0]
    maxlen = 4

    the output tuple (Out, Length):
        Out.data = [[[a],[b],[0],[0]],[[c],[d],[e],[0]]]
        Length.data = [2, 3]      # original sequence lengths

Case 2:
    Given input 1-level Tensor x:
        x.lod = [[0, 2, 5]]
        x.data = [[a1,a2],[b1,b2],[c1,c2],[d1,d2],[e1,e2]]
    pad_value:
        pad_value.data = [0]
    default maxlen = None (the effective value is 3, according to the shape of x)

    the output tuple (Out, Length):
        Out.data = [[[a1,a2],[b1,b2],[0,0]],[[c1,c2],[d1,d2],[e1,e2]]]
        Length.data = [2, 3]

Case 3:
    Given input 1-level Tensor x:
        x.lod = [[0, 2, 5]]
        x.data = [[a1,a2],[b1,b2],[c1,c2],[d1,d2],[e1,e2]]
    pad_value:
        pad_value.data = [p1,p2]
    default maxlen = None (the effective value is 3)

    the output tuple (Out, Length):
        Out.data = [[[a1,a2],[b1,b2],[p1,p2]],[[c1,c2],[d1,d2],[e1,e2]]]
        Length.data = [2, 3]
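The three cases can be reproduced with a small pure-Python model. This is only a sketch of the behaviour described above, not Paddle's implementation: the function name `sequence_pad_reference` and its list-based arguments are assumptions made for illustration, and it expects `pad_value` to already be one full time step of K values.

```python
def sequence_pad_reference(data, lod_offsets, pad_value, maxlen=None):
    """Pure-Python model of the padding described above (not Paddle code).

    data        -- list of M time steps, each a list of K values
    lod_offsets -- level-0 offsets, e.g. [0, 2, 5] for lengths 2 and 3
    pad_value   -- one time step (list of K values) used as filler
    maxlen      -- target length, or None to use the longest sequence
    """
    # Offsets [0, 2, 5] delimit the sequences data[0:2] and data[2:5].
    bounds = list(zip(lod_offsets[:-1], lod_offsets[1:]))
    lengths = [end - start for start, end in bounds]
    target = max(lengths) if maxlen is None else maxlen

    out = []
    for (start, end), n in zip(bounds, lengths):
        # Copy the real steps, then append filler at the tail.
        seq = [list(step) for step in data[start:end]]
        seq += [list(pad_value) for _ in range(target - n)]
        out.append(seq)
    return out, lengths


# Case 1 from the text, with strings standing in for a..e.
out, length = sequence_pad_reference(
    [["a"], ["b"], ["c"], ["d"], ["e"]], [0, 2, 5], pad_value=[0], maxlen=4)
print(out)     # [[['a'], ['b'], [0], [0]], [['c'], ['d'], ['e'], [0]]]
print(length)  # [2, 3]
```

The sketch performs no validation of maxlen; it only shows how the offsets in x.lod split x.data into sequences and where the filler goes.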
- Parameters
-
x (Tensor) – Input 1-level Tensor with dims [M, K]. The batch size is described by the LoD information (the number of sequences). The data type should be float32, float64, int8, int32 or int64.
pad_value (Tensor) – Padding value. It can be a scalar or a 1D tensor with length K. If it is a scalar, it is automatically broadcast to a Tensor. The data type should be the same as that of x.
maxlen (int, optional) – The length of the padded sequences, None by default. When it is None, all sequences are padded up to the length of the longest one among them; when it is a positive value, it must be greater than the length of the longest original sequence.
name (str, optional) – For detailed information, please refer to Name. Usually there is no need to set name; it is None by default.
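The rules for pad_value and maxlen can be illustrated with two small helpers. Both names (`broadcast_pad_value`, `resolve_maxlen`) are hypothetical and the code is a plain-Python sketch, not part of Paddle. The parameter description asks for a maxlen greater than the longest sequence; whether an equal value is accepted has not been verified here, so the sketch rejects only values that are too short to hold the data.

```python
def broadcast_pad_value(pad_value, k):
    """Return pad_value as a list of k entries (hypothetical helper)."""
    if not isinstance(pad_value, (list, tuple)):
        return [pad_value] * k          # scalar: repeat across the K columns
    if len(pad_value) == 1:
        return list(pad_value) * k      # one-element value, treated as a scalar
    if len(pad_value) != k:
        raise ValueError(f"pad_value has {len(pad_value)} entries, expected {k}")
    return list(pad_value)


def resolve_maxlen(lengths, maxlen=None):
    """Pick the padded length for sequences with the given lengths."""
    longest = max(lengths)
    if maxlen is None:
        return longest                  # default: pad up to the longest sequence
    if maxlen < longest:
        raise ValueError(
            f"maxlen={maxlen} is shorter than the longest sequence ({longest})")
    return maxlen


print(broadcast_pad_value(0, 2))              # [0, 0]
print(broadcast_pad_value(["p1", "p2"], 2))   # ['p1', 'p2']
print(resolve_maxlen([2, 3]))                 # 3
print(resolve_maxlen([2, 3], maxlen=4))       # 4
```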
- Returns
-
The first element is a 0-level Tensor Out with the shape [batch_size, maxlen, K]; the second is Length, the original sequence lengths, which is a 0-level 1D Tensor. The size of Length is equal to the batch size, and its data type is int64.
- Return type
-
tuple, a Python tuple (Out, Length)
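To show how Out and Length relate, the sketch below strips the padding again using only the two returned values. It is a pure-Python model of the unpadding idea, with nested lists standing in for tensors; the name `sequence_unpad_reference` is an assumption for illustration and is not the sequence_unpad API.

```python
def sequence_unpad_reference(out, length):
    """Drop tail padding from each row of `out` using `length` (not Paddle code)."""
    data, offsets = [], [0]
    for seq, n in zip(out, length):
        data.extend(seq[:n])             # keep only the first n real time steps
        offsets.append(offsets[-1] + n)  # rebuild the level-0 offsets
    return data, offsets


# Out and Length shaped like Case 3, with numbers standing in for the symbols.
out = [[[1, 2], [3, 4], [-1, -1]], [[5, 6], [7, 8], [9, 10]]]
length = [2, 3]
data, offsets = sequence_unpad_reference(out, length)
print(data)     # [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
print(offsets)  # [0, 2, 5]
```

Length is what makes the round trip possible: without it, a padded step cannot be distinguished from real data that happens to equal pad_value.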
Examples
>>> import paddle
>>> paddle.enable_static()
>>> import paddle.base as base
>>> import numpy

>>> x = paddle.static.data(name='x', shape=[10, 5], dtype='float32', lod_level=1)
>>> pad_value = paddle.assign(
...     numpy.array([0.0], dtype=numpy.float32))
>>> out = paddle.static.nn.sequence_pad(x=x, pad_value=pad_value)