Embedding
- class paddle.nn.Embedding(num_embeddings, embedding_dim, padding_idx=None, sparse=False, weight_attr=None, name=None) [source]
Embedding Layer, used to construct a callable object of the Embedding class. For specific usage, refer to the code examples. It implements the function of the Embedding Layer: it looks up the embedding vectors of the ids provided by x. It automatically constructs a 2D embedding matrix based on the input num_embeddings and embedding_dim. The shape of the output Tensor is generated by appending an embedding_dim dimension to the last dimension of the input Tensor shape.
Note
The id in x must satisfy \(0 \leq id < num\_embeddings\), otherwise the program will throw an exception and exit.

Case 1:

    x is a Tensor.
    padding_idx = -1
    x.data = [[1, 3], [2, 4], [4, 127]]
    x.shape = [3, 2]
    Given size = [128, 16]

    output is a Tensor:
        out.shape = [3, 2, 16]
        out.data = [[[0.129435295, 0.244512452, ..., 0.436322452],
                     [0.345421456, 0.524563927, ..., 0.144534654]],
                    [[0.345249859, 0.124939536, ..., 0.194353745],
                     [0.945345345, 0.435394634, ..., 0.435345365]],
                    [[0.945345345, 0.435394634, ..., 0.435345365],
                     [0.0,         0.0,         ..., 0.0        ]]]  # padding data

    The input padding_idx is less than 0, so it is automatically converted to
    padding_idx = -1 + 128 = 127. All-zero data is output whenever an id equals 127.
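A minimal runnable sketch of this padding behaviour (the small vocabulary size and the ids below are illustrative, not part of the case above):

>>> import paddle
>>> # padding_idx=-1 is converted to num_embeddings + padding_idx = 4 - 1 = 3
>>> emb = paddle.nn.Embedding(4, 3, padding_idx=-1)
>>> x = paddle.to_tensor([[1, 3]], dtype="int64")
>>> out = emb(x)            # out.shape == [1, 2, 3]
>>> # out[0, 1] is all zeros because id 3 is the padding index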
- Parameters
num_embeddings (int) – The size of the dictionary of embeddings (the vocabulary size).
embedding_dim (int) – The size of each embedding vector.
padding_idx (int|long|None, optional) – padding_idx needs to be in the interval [-num_embeddings, num_embeddings). If \(padding\_idx < 0\), the \(padding\_idx\) will automatically be converted to \(num\_embeddings + padding\_idx\). All-zero padding data is output whenever lookup encounters \(padding\_idx\) in an id, and the padding data is not updated during training. If set to None, it has no effect on the output. Default: None.
sparse (bool, optional) – The flag indicating whether to use sparse update. This parameter only affects the performance of the backward gradient update. It is recommended to set it to True because sparse update is faster. However, some optimizers do not support sparse update, such as paddle.optimizer.Adadelta, paddle.optimizer.Adamax and paddle.optimizer.Lamb; in these cases sparse must be False. Default: False.
weight_attr (ParamAttr, optional) – To specify the weight parameter property. Default: None, which means the default weight parameter property is used. See ParamAttr for usage details. In addition, user-defined or pre-trained word vectors can be loaded with the weight_attr parameter. The local word vectors need to be converted to numpy format, and their shape should be consistent with [num_embeddings, embedding_dim]. The Assign initializer is then used to load the custom or pre-trained word vectors, as shown in the sketch after this list.
name (str, optional) – For detailed information, please refer to Name. Usually name does not need to be set and is None by default.
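A minimal sketch of loading pre-trained vectors through weight_attr, assuming the local word vectors are already available as a numpy array of shape [num_embeddings, embedding_dim] (the array below is a stand-in):

>>> import numpy as np
>>> import paddle
>>> pretrained = np.arange(12, dtype="float32").reshape(4, 3)  # stand-in for real vectors
>>> attr = paddle.ParamAttr(
...     initializer=paddle.nn.initializer.Assign(pretrained),
...     trainable=False)  # set trainable=False to freeze the loaded vectors
>>> embedding = paddle.nn.Embedding(4, 3, weight_attr=attr)
>>> out = embedding(paddle.to_tensor([[0], [2]], dtype="int64"))  # shape [2, 1, 3]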
- Attribute
weight (Parameter) – the learnable weights of this layer.
- Returns
None
Examples
>>> import paddle
>>> x = paddle.to_tensor([[0], [1], [3]], dtype="int64", stop_gradient=False)
>>> embedding = paddle.nn.Embedding(4, 3, sparse=True)

>>> w0 = paddle.to_tensor([[0., 0., 0.],
...                        [1., 1., 1.],
...                        [2., 2., 2.],
...                        [3., 3., 3.]], dtype="float32")
>>> embedding.weight.set_value(w0)
>>> print(embedding.weight)
Parameter containing:
Tensor(shape=[4, 3], dtype=float32, place=Place(cpu), stop_gradient=False,
       [[0., 0., 0.],
        [1., 1., 1.],
        [2., 2., 2.],
        [3., 3., 3.]])

>>> adam = paddle.optimizer.Adam(parameters=[embedding.weight], learning_rate=0.01)
>>> adam.clear_grad()

>>> out = embedding(x)
>>> print(out)
Tensor(shape=[3, 1, 3], dtype=float32, place=Place(cpu), stop_gradient=False,
       [[[0., 0., 0.]],
        [[1., 1., 1.]],
        [[3., 3., 3.]]])

>>> out.backward()
>>> adam.step()
forward(x)
Defines the computation performed at every call. Should be overridden by all subclasses.
- Parameters
*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments
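A minimal sketch of the calling convention, using a hypothetical subclass; calling the layer object dispatches to its forward method:

>>> import paddle
>>> class BagOfWords(paddle.nn.Layer):        # hypothetical example layer
...     def __init__(self):
...         super().__init__()
...         self.emb = paddle.nn.Embedding(10, 4)
...     def forward(self, x):                 # the overridden computation
...         return self.emb(x).sum(axis=-2)   # sum the word vectors in each sample
>>> layer = BagOfWords()
>>> out = layer(paddle.to_tensor([[1, 2, 5]], dtype="int64"))  # invokes forward
>>> # out.shape == [1, 4]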
extra_repr()
Extra representation of this layer; you can provide a custom implementation in your own layer.
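A minimal sketch of a custom extra_repr, using a hypothetical wrapper layer; the returned string is appended to the layer's printed representation:

>>> import paddle
>>> class TaggedEmbedding(paddle.nn.Layer):   # hypothetical example layer
...     def __init__(self, num_embeddings, embedding_dim):
...         super().__init__()
...         self.num_embeddings = num_embeddings
...         self.emb = paddle.nn.Embedding(num_embeddings, embedding_dim)
...     def forward(self, x):
...         return self.emb(x)
...     def extra_repr(self):                 # shown when the layer is printed
...         return f"num_embeddings={self.num_embeddings}"
>>> print(TaggedEmbedding(4, 3))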