Conv3D¶
- class paddle.sparse.nn.Conv3D(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, padding_mode='zeros', weight_attr=None, bias_attr=None, data_format='NDHWC') [source]
-
Sparse Convolution3D Layer. The sparse convolution3d layer calculates the output based on the input, filter and the strides, paddings, dilations and groups parameters. Input (Input) and Output (Output) are multidimensional SparseCooTensors with a shape of \([N, D, H, W, C]\), where N is the batch size, C is the number of channels, D is the depth of the feature, H is the height of the feature, and W is the width of the feature. If a bias is provided, it is added to the output of the convolution. For each input \(X\), the equation is:
\[Out = W \ast X + b\]
In the above equation:
\(X\): Input value, a tensor with NDHWC format.
\(W\): Filter value, a tensor with DHWCM format.
\(\ast\): Convolution operation.
\(b\): Bias value, a 1-D tensor with shape [M].
\(Out\): Output value, the shape of \(Out\) and \(X\) may be different.
- Parameters
-
in_channels (int) – The number of input channels in the input image.
out_channels (int) – The number of output channels produced by the convolution.
kernel_size (int|list|tuple) – The size of the convolving kernel.
stride (int|list|tuple, optional) – The stride size. If stride is a list/tuple, it must contain three integers, (stride_D, stride_H, stride_W). Otherwise, the stride_D = stride_H = stride_W = stride. The default value is 1.
padding (int|str|tuple|list, optional) – The padding size. Padding can take one of the following forms (see the short sketch after this parameter list): 1. a string in ['valid', 'same']. 2. an int, meaning each spatial dimension (depth, height, width) is zero-padded symmetrically by that amount. 3. a list[int] or tuple[int] whose length equals the number of spatial dimensions, giving the symmetric padding of each spatial dimension in the form [pad_d, pad_h, pad_w]. 4. a list[int] or tuple[int] whose length is 2 * the number of spatial dimensions, in the form [pad_before, pad_after, pad_before, pad_after, ...] over the spatial dimensions. 5. a list or tuple of pairs of ints of the form [[pad_before, pad_after], [pad_before, pad_after], ...]; note that the batch dimension and channel dimension are also included, and their padding must be [0, 0] or (0, 0). The default value is 0.
dilation (int|list|tuple, optional) – The dilation size. If dilation is a list/tuple, it must contain three integers, (dilation_D, dilation_H, dilation_W). Otherwise, the dilation_D = dilation_H = dilation_W = dilation. The default value is 1.
groups (int, optional) – The number of groups for the Conv3D layer. According to grouped convolution in Alex Krizhevsky's Deep CNN paper: when groups=2, the first half of the filters is only connected to the first half of the input channels, while the second half of the filters is only connected to the second half of the input channels. The default value is 1; currently, only groups=1 is supported.
padding_mode (str, optional) – One of 'zeros', 'reflect', 'replicate' or 'circular'. Currently only 'zeros' is supported. The default value is 'zeros'.
weight_attr (ParamAttr, optional) – The parameter attribute for learnable parameters/weights of conv3d. If it is set to None or one attribute of ParamAttr, conv3d will create ParamAttr as param_attr. If it is set to None, the parameter is initialized with \(Normal(0.0, std)\), where \(std\) is \((\frac{2.0 }{filter\_elem\_num})^{0.5}\). The default value is None.
bias_attr (ParamAttr|bool, optional) – The parameter attribute for the bias of conv3d. If it is set to False, no bias will be added to the output units. If it is set to None or one attribute of ParamAttr, conv3d will create ParamAttr as bias_attr. If the Initializer of the bias_attr is not set, the bias is initialized zero. The default value is None.
data_format (str, optional) – Data format that specifies the layout of the input. It can be "NCDHW" or "NDHWC". Currently, only "NDHWC" is supported. The default value is "NDHWC".
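As a minimal sketch of the padding forms listed above (the channel counts and kernel size are arbitrary, illustration-only values; the string forms 'valid' and 'same' are omitted):
>>> import paddle
>>> # form 2: a single int pads depth, height and width symmetrically
>>> conv_a = paddle.sparse.nn.Conv3D(2, 4, kernel_size=3, padding=1)
>>> # form 3: one int per spatial dimension, [pad_d, pad_h, pad_w]
>>> conv_b = paddle.sparse.nn.Conv3D(2, 4, kernel_size=3, padding=[1, 2, 2])
>>> # form 4: before/after values for each spatial dimension
>>> conv_c = paddle.sparse.nn.Conv3D(2, 4, kernel_size=3, padding=[1, 1, 2, 2, 0, 0])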
Attribute:
weight (Parameter): the learnable weights of filters of this layer.
bias (Parameter): the learnable bias of this layer.
Shape:
x: \((N, D_{in}, H_{in}, W_{in}, C_{in})\)
weight: \((K_{d}, K_{h}, K_{w}, C_{in}, C_{out})\)
bias: \((C_{out})\)
output: \((N, D_{out}, H_{out}, W_{out}, C_{out})\)
Where
\[ \begin{align}\begin{aligned}D_{out}&= \frac{(D_{in} + 2 * paddings[0] - (dilations[0] * (kernel\_size[0] - 1) + 1))}{strides[0]} + 1\\H_{out}&= \frac{(H_{in} + 2 * paddings[1] - (dilations[1] * (kernel\_size[1] - 1) + 1))}{strides[1]} + 1\\W_{out}&= \frac{(W_{in} + 2 * paddings[2] - (dilations[2] * (kernel\_size[2] - 1) + 1))}{strides[2]} + 1\end{aligned}\end{align} \]
Examples
>>> import paddle
>>> indices = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 2], [1, 3, 2, 3]]
>>> values = [[1], [2], [3], [4]]
>>> indices = paddle.to_tensor(indices, dtype='int32')
>>> values = paddle.to_tensor(values, dtype='float32')
>>> dense_shape = [1, 1, 3, 4, 1]
>>> sparse_x = paddle.sparse.sparse_coo_tensor(indices, values, dense_shape, stop_gradient=True)
>>> conv = paddle.sparse.nn.Conv3D(1, 1, (1, 3, 3))
>>> y = conv(sparse_x)
>>> print(y.shape)
[1, 1, 1, 2, 1]
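To connect this example with the shape formulas above, the following sketch recomputes the printed shape by hand, using kernel_size=(1, 3, 3) and the default stride=1, padding=0, dilation=1:
>>> def out_dim(in_dim, k, pad=0, dilation=1, stride=1):
...     # output-size formula from the Shape section above
...     return (in_dim + 2 * pad - (dilation * (k - 1) + 1)) // stride + 1
...
>>> [1, out_dim(1, 1), out_dim(3, 3), out_dim(4, 3), 1]
[1, 1, 1, 2, 1]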
-
add_parameter(name, parameter)¶
-
Adds a Parameter instance.
Added parameter can be accessed by self.name
- Parameters
-
name (str) – name of this parameter.
parameter (Parameter) – an instance of Parameter.
- Returns
-
Parameter, the parameter passed in.
Examples
>>> import paddle >>> paddle.seed(100) >>> class MyLayer(paddle.nn.Layer): ... def __init__(self): ... super().__init__() ... self._linear = paddle.nn.Linear(1, 1) ... w_tmp = self.create_parameter([1,1]) ... self.add_parameter("w_tmp", w_tmp) ... ... def forward(self, input): ... return self._linear(input) ... >>> mylayer = MyLayer() >>> for name, param in mylayer.named_parameters(): ... print(name, param) w_tmp Parameter containing: Tensor(shape=[1, 1], dtype=float32, place=Place(cpu), stop_gradient=False, [[-1.01448846]]) _linear.weight Parameter containing: Tensor(shape=[1, 1], dtype=float32, place=Place(cpu), stop_gradient=False, [[0.18551230]]) _linear.bias Parameter containing: Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=False, [0.])
-
add_sublayer(name, sublayer)¶
-
Adds a sub Layer instance.
Added sublayer can be accessed by self.name
- Parameters
-
name (str) – name of this sublayer.
sublayer (Layer) – an instance of Layer.
- Returns
-
Layer, the sublayer passed in.
Examples
>>> import paddle >>> class MySequential(paddle.nn.Layer): ... def __init__(self, *layers): ... super().__init__() ... if len(layers) > 0 and isinstance(layers[0], tuple): ... for name, layer in layers: ... self.add_sublayer(name, layer) ... else: ... for idx, layer in enumerate(layers): ... self.add_sublayer(str(idx), layer) ... ... def forward(self, input): ... for layer in self._sub_layers.values(): ... input = layer(input) ... return input ... >>> fc1 = paddle.nn.Linear(10, 3) >>> fc2 = paddle.nn.Linear(3, 10, bias_attr=False) >>> model = MySequential(fc1, fc2) >>> for prefix, layer in model.named_sublayers(): ... print(prefix, layer) 0 Linear(in_features=10, out_features=3, dtype=float32) 1 Linear(in_features=3, out_features=10, dtype=float32)
-
apply(fn)¶
-
Applies fn recursively to every sublayer (as returned by .sublayers()) as well as self. Typical use includes initializing the parameters of a model.
- Parameters
-
fn (function) – a function to be applied to each sublayer
- Returns
-
Layer, self
- Example::
-
>>> import paddle >>> import paddle.nn as nn >>> paddle.seed(2023) >>> net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2)) >>> def init_weights(layer): ... if type(layer) == nn.Linear: ... print('before init weight:', layer.weight.numpy()) ... new_weight = paddle.full(shape=layer.weight.shape, dtype=layer.weight.dtype, fill_value=0.9) ... layer.weight.set_value(new_weight) ... print('after init weight:', layer.weight.numpy()) ... >>> net.apply(init_weights) >>> print(net.state_dict()) before init weight: [[ 0.89611185 0.04935038] [-0.5888344 0.99266374]] after init weight: [[0.9 0.9] [0.9 0.9]] before init weight: [[-0.18615901 -0.22924072] [ 1.1517721 0.59859073]] after init weight: [[0.9 0.9] [0.9 0.9]] OrderedDict([('0.weight', Parameter containing: Tensor(shape=[2, 2], dtype=float32, place=Place(cpu), stop_gradient=False, [[0.89999998, 0.89999998], [0.89999998, 0.89999998]])), ('0.bias', Parameter containing: Tensor(shape=[2], dtype=float32, place=Place(cpu), stop_gradient=False, [0., 0.])), ('1.weight', Parameter containing: Tensor(shape=[2, 2], dtype=float32, place=Place(cpu), stop_gradient=False, [[0.89999998, 0.89999998], [0.89999998, 0.89999998]])), ('1.bias', Parameter containing: Tensor(shape=[2], dtype=float32, place=Place(cpu), stop_gradient=False, [0., 0.]))])
-
astype(dtype=None)¶
-
Casts all parameters and buffers to dtype and then returns the Layer.
- Parameters
-
dtype (str|paddle.dtype|numpy.dtype) – target data type of layer. If set str, it can be “bool”, “bfloat16”, “float16”, “float32”, “float64”, “int8”, “int16”, “int32”, “int64”, “uint8”, “complex64”, “complex128”. Default: None
- Returns
-
Layer, self
Examples
>>> import paddle >>> import paddle.nn as nn >>> weight_attr = paddle.ParamAttr(name="weight",initializer=paddle.nn.initializer.Constant(value=1.5)) >>> bias_attr = paddle.ParamAttr(name="bias",initializer=paddle.nn.initializer.Constant(value=2.5)) >>> linear = paddle.nn.Linear(2, 2, weight_attr=weight_attr, bias_attr=bias_attr).to(device="cpu",dtype="float32") >>> print(linear) Linear(in_features=2, out_features=2, dtype=float32) >>> print(linear.parameters()) [Parameter containing: Tensor(shape=[2, 2], dtype=float32, place=Place(cpu), stop_gradient=False, [[1.50000000, 1.50000000], [1.50000000, 1.50000000]]), Parameter containing: Tensor(shape=[2], dtype=float32, place=Place(cpu), stop_gradient=False, [2.50000000, 2.50000000])] >>> linear=linear.astype("int8") >>> print(linear) Linear(in_features=2, out_features=2, dtype=paddle.int8) >>> print(linear.parameters()) [Parameter containing: Tensor(shape=[2, 2], dtype=int8, place=Place(cpu), stop_gradient=False, [[1, 1], [1, 1]]), Parameter containing: Tensor(shape=[2], dtype=int8, place=Place(cpu), stop_gradient=False, [2, 2])]
-
bfloat16(excluded_layers=None)¶
-
Casts all floating point parameters and buffers to bfloat16 data type.
Note
nn.BatchNorm does not support bfloat16 weights, so it would not be converted by default.
- Parameters
-
excluded_layers (nn.Layer|list|tuple|None, optional) – Specify the layers that need to keep their original data type. If excluded_layers is None, casts all floating point parameters and buffers except nn.BatchNorm. Default: None.
- Returns
-
self
- Return type
-
Layer
Examples
>>> import paddle
>>> class Model(paddle.nn.Layer):
...     def __init__(self):
...         super().__init__()
...         self.linear = paddle.nn.Linear(1, 1)
...         self.dropout = paddle.nn.Dropout(p=0.5)
...
...     def forward(self, input):
...         out = self.linear(input)
...         out = self.dropout(out)
...         return out
...
>>> model = Model()
>>> model.bfloat16()
>>> # UserWarning: Paddle compiled by the user does not support bfloat16, so keep original data type.
Model(
  (linear): Linear(in_features=1, out_features=1, dtype=float32)
  (dropout): Dropout(p=0.5, axis=None, mode=upscale_in_train)
)
-
buffers(include_sublayers=True)¶
-
Returns a list of all buffers from current layer and its sub-layers.
- Parameters
-
include_sublayers (bool, optional) – Whether include the buffers of sublayers. If True, also include the buffers from sublayers. Default: True.
- Returns
-
list of Tensor, a list of buffers.
Examples
>>> import numpy as np
>>> import paddle
>>> linear = paddle.nn.Linear(10, 3)
>>> value = np.array([0]).astype("float32")
>>> buffer = paddle.to_tensor(value)
>>> linear.register_buffer("buf_name", buffer, persistable=True)
>>> print(linear.buffers())
[Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True, [0.])]
-
children()¶
-
Returns an iterator over immediate children layers.
- Yields
-
Layer – a child layer
Examples
>>> import paddle
>>> linear1 = paddle.nn.Linear(10, 3)
>>> linear2 = paddle.nn.Linear(3, 10, bias_attr=False)
>>> model = paddle.nn.Sequential(linear1, linear2)
>>> layer_list = list(model.children())
>>> print(layer_list)
[Linear(in_features=10, out_features=3, dtype=float32), Linear(in_features=3, out_features=10, dtype=float32)]
-
clear_gradients(set_to_zero=True)¶
-
Clear the gradients of all parameters for this layer.
- Parameters
-
set_to_zero (bool, optional) – Whether to set the trainable parameters’ gradients to zero or None. Default is True.
- Returns
-
None
Examples
>>> import paddle
>>> import numpy as np
>>> value = np.arange(26).reshape(2, 13).astype("float32")
>>> a = paddle.to_tensor(value)
>>> linear = paddle.nn.Linear(13, 5)
>>> adam = paddle.optimizer.Adam(learning_rate=0.01,
...                              parameters=linear.parameters())
>>> out = linear(a)
>>> out.backward()
>>> adam.step()
>>> linear.clear_gradients()
-
create_parameter(shape, attr=None, dtype=None, is_bias=False, default_initializer=None)¶
-
Create parameters for this layer.
- Parameters
-
shape (list) – Shape of the parameter. The data type in the list must be int.
attr (ParamAttr, optional) – Parameter attribute of weight. Please refer to ParamAttr. Default: None.
dtype (str, optional) – Data type of this parameter. If set str, it can be “bool”, “float16”, “float32”, “float64”, “int8”, “int16”, “int32”, “int64”, “uint8” or “uint16”. Default: “float32”.
is_bias (bool, optional) – if this is a bias parameter. Default: False.
default_initializer (Initializer, optional) – the default initializer for this parameter. If set None, default initializer will be set to paddle.nn.initializer.Xavier and paddle.nn.initializer.Constant for non-bias and bias parameter, respectively. Default: None.
- Returns
-
Tensor, created parameter.
Examples
>>> import paddle >>> paddle.seed(2023) >>> class MyLayer(paddle.nn.Layer): ... def __init__(self): ... super().__init__() ... self._linear = paddle.nn.Linear(1, 1) ... w_tmp = self.create_parameter([1,1]) ... self.add_parameter("w_tmp", w_tmp) ... ... def forward(self, input): ... return self._linear(input) ... >>> mylayer = MyLayer() >>> for name, param in mylayer.named_parameters(): ... print(name, param) # will print w_tmp,_linear.weight,_linear.bias w_tmp Parameter containing: Tensor(shape=[1, 1], dtype=float32, place=Place(cpu), stop_gradient=False, [[0.06979191]]) _linear.weight Parameter containing: Tensor(shape=[1, 1], dtype=float32, place=Place(cpu), stop_gradient=False, [[1.26729357]]) _linear.bias Parameter containing: Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=False, [0.])
-
create_tensor(name=None, persistable=None, dtype=None)¶
-
Create Tensor for this layer.
- Parameters
-
name (str, optional) – name of the tensor. Please refer to Name . Default: None.
persistable (bool, optional) – if set this tensor persistable. Default: False.
dtype (str, optional) – data type of this parameter. If set str, it can be “bool”, “float16”, “float32”, “float64”, “int8”, “int16”, “int32”, “int64”, “uint8” or “uint16”. If set None, it will be “float32”. Default: None.
- Returns
-
Tensor, created Tensor.
Examples
>>> import paddle >>> class MyLinear(paddle.nn.Layer): ... def __init__(self, ... in_features, ... out_features): ... super().__init__() ... self.linear = paddle.nn.Linear(10, 10) ... ... self.back_var = self.create_tensor(name = "linear_tmp_0", dtype=self._dtype) ... ... def forward(self, input): ... out = self.linear(input) ... paddle.assign(out, self.back_var) ... ... return out
-
create_variable(name=None, persistable=None, dtype=None)¶
-
Warning
API “paddle.nn.layer.layers.create_variable” is deprecated since 2.0.0, and will be removed in future versions. Please use “paddle.nn.Layer.create_tensor” instead. Reason: New api in create_tensor, easier to use.
Create Tensor for this layer.
- Parameters
-
name (str, optional) – name of the tensor. Please refer to Name . Default: None
persistable (bool, optional) – if set this tensor persistable. Default: False
dtype (str, optional) – data type of this parameter. If set str, it can be "bool", "float16", "float32", "float64", "int8", "int16", "int32", "int64", "uint8" or "uint16". If set None, it will be "float32". Default: None.
- Returns
-
Tensor, created Tensor.
Examples
>>> import paddle >>> class MyLinear(paddle.nn.Layer): ... def __init__(self, ... in_features, ... out_features): ... super().__init__() ... self.linear = paddle.nn.Linear( 10, 10) ... ... self.back_var = self.create_variable(name = "linear_tmp_0", dtype=self._dtype) ... ... def forward(self, input): ... out = self.linear(input) ... paddle.assign( out, self.back_var) ... ... return out
-
eval()¶
-
Sets this Layer and all its sublayers to evaluation mode. This only affects certain modules like Dropout and BatchNorm.
- Returns
-
None
- Example::
-
>>> import paddle >>> paddle.seed(100) >>> class MyLayer(paddle.nn.Layer): ... def __init__(self): ... super().__init__() ... self._linear = paddle.nn.Linear(1, 1) ... self._dropout = paddle.nn.Dropout(p=0.5) ... ... def forward(self, input): ... temp = self._linear(input) ... temp = self._dropout(temp) ... return temp ... >>> x = paddle.randn([10, 1], 'float32') >>> mylayer = MyLayer() >>> mylayer.eval() # set mylayer._dropout to eval mode >>> out = mylayer(x) >>> print(out) Tensor(shape=[10, 1], dtype=float32, place=Place(cpu), stop_gradient=False, [[-1.72439659], [ 0.31532824], [ 0.01192369], [-0.36912638], [-1.63426113], [-0.93169814], [ 0.32222399], [-1.61092973], [ 0.77209264], [-0.34038994]])
-
extra_repr()¶
-
Extra representation of this layer; you can provide a custom implementation for your own layer (see the sketch below).
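A minimal sketch of a custom extra_repr in a user-defined layer (MyBlock and its fields are illustrative, not part of the API); the returned string should appear in the layer's printed representation roughly as shown:
>>> import paddle
>>> class MyBlock(paddle.nn.Layer):
...     def __init__(self, channels, dropout_p=0.1):
...         super().__init__()
...         self._channels = channels
...         self._dropout_p = dropout_p
...
...     def extra_repr(self):
...         return f'channels={self._channels}, dropout_p={self._dropout_p}'
...
>>> print(MyBlock(4))
MyBlock(channels=4, dropout_p=0.1)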
-
float(excluded_layers=None)¶
-
Casts all floating point parameters and buffers to float data type.
- Parameters
-
excluded_layers (nn.Layer|list|tuple|None, optional) – Specify the layers that need to keep their original data type. If excluded_layers is None, casts all floating point parameters and buffers. Default: None.
- Returns
-
self
- Return type
-
Layer
Examples
>>> import paddle >>> class Model(paddle.nn.Layer): ... def __init__(self): ... super().__init__() ... self.linear = paddle.nn.Linear(1, 1) ... self.dropout = paddle.nn.Dropout(p=0.5) ... ... def forward(self, input): ... out = self.linear(input) ... out = self.dropout(out) ... return out ... >>> model = Model() >>> model.float() Model( (linear): Linear(in_features=1, out_features=1, dtype=paddle.float32) (dropout): Dropout(p=0.5, axis=None, mode=upscale_in_train) )
-
float16(excluded_layers=None)¶
-
Casts all floating point parameters and buffers to float16 data type.
Note
nn.BatchNorm does not support float16 weights, so it would not be converted by default.
- Parameters
-
excluded_layers (nn.Layer|list|tuple|None, optional) – Specify the layers that need to keep their original data type. If excluded_layers is None, casts all floating point parameters and buffers except nn.BatchNorm. Default: None.
- Returns
-
self
- Return type
-
Layer
Examples
>>> import paddle
>>> class Model(paddle.nn.Layer):
...     def __init__(self):
...         super().__init__()
...         self.linear = paddle.nn.Linear(1, 1)
...         self.dropout = paddle.nn.Dropout(p=0.5)
...
...     def forward(self, input):
...         out = self.linear(input)
...         out = self.dropout(out)
...         return out
...
>>> model = Model()
>>> model.float16()
Model(
  (linear): Linear(in_features=1, out_features=1, dtype=float32)
  (dropout): Dropout(p=0.5, axis=None, mode=upscale_in_train)
)
-
forward(x)¶
-
Defines the computation performed at every call. Should be overridden by all subclasses (see the sketch after the parameter list).
- Parameters
-
*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments
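A minimal sketch of overriding forward in a subclass (SparseConvReLU is an illustrative name, not a Paddle class; it assumes a SparseCooTensor input in NDHWC layout):
>>> import paddle
>>> class SparseConvReLU(paddle.nn.Layer):
...     def __init__(self, in_channels, out_channels):
...         super().__init__()
...         self._conv = paddle.sparse.nn.Conv3D(in_channels, out_channels, (1, 3, 3))
...
...     def forward(self, x):
...         # x is expected to be a SparseCooTensor in NDHWC layout
...         return paddle.sparse.nn.functional.relu(self._conv(x))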
-
full_name()¶
-
Full name for this layer, composed by name_scope + “/” + MyLayer.__class__.__name__
- Returns
-
str, full name of this layer.
- Example::
-
>>> import paddle >>> class LinearNet(paddle.nn.Layer): ... def __init__(self): ... super().__init__(name_scope = "demo_linear_net") ... self._linear = paddle.nn.Linear(1, 1) ... ... def forward(self, x): ... return self._linear(x) ... >>> linear_net = LinearNet() >>> print(linear_net.full_name()) demo_linear_net_0
-
load_dict(state_dict, use_structured_name=True)¶
-
Set parameters and persistable buffers from state_dict. All the parameters and buffers will be reset by the tensor in the state_dict
- Parameters
-
state_dict (dict) – Dict contains all the parameters and persistable buffers.
use_structured_name (bool, optional) – If true, use structured name as key, otherwise, use parameter or buffer name as key. Default: True.
- Returns
-
missing_keys (list): A list of str containing the missing keys. unexpected_keys (list): A list of str containing the unexpected keys.
- Return type
-
tuple (missing_keys, unexpected_keys)
Examples
>>> import paddle
>>> emb = paddle.nn.Embedding(10, 10)
>>> state_dict = emb.state_dict()
>>> paddle.save(state_dict, "paddle_dy.pdparams")
>>> para_state_dict = paddle.load("paddle_dy.pdparams")
>>> emb.set_state_dict(para_state_dict)
-
named_buffers(prefix='', include_sublayers=True)¶
-
Returns an iterator over all buffers in the Layer, yielding tuple of name and Tensor.
- Parameters
-
prefix (str, optional) – Prefix to prepend to all buffer names. Default: ‘’.
include_sublayers (bool, optional) – Whether include the buffers of sublayers. If True, also include the named buffers from sublayers. Default: True.
- Yields
-
(string, Tensor) – Tuple of name and tensor
Examples
>>> import numpy as np >>> import paddle >>> fc1 = paddle.nn.Linear(10, 3) >>> buffer1 = paddle.to_tensor(np.array([0]).astype("float32")) >>> # register a tensor as buffer by specific `persistable` >>> fc1.register_buffer("buf_name_1", buffer1, persistable=True) >>> fc2 = paddle.nn.Linear(3, 10) >>> buffer2 = paddle.to_tensor(np.array([1]).astype("float32")) >>> # register a buffer by assigning an attribute with Tensor. >>> # The `persistable` can only be False by this way. >>> fc2.buf_name_2 = buffer2 >>> model = paddle.nn.Sequential(fc1, fc2) >>> # get all named buffers >>> for name, buffer in model.named_buffers(): ... print(name, buffer) 0.buf_name_1 Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True, [0.]) 1.buf_name_2 Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True, [1.])
-
named_children()¶
-
Returns an iterator over immediate children layers, yielding both the name of the layer as well as the layer itself.
- Yields
-
(string, Layer) – Tuple containing a name and child layer
Examples
>>> import paddle
>>> linear1 = paddle.nn.Linear(10, 3)
>>> linear2 = paddle.nn.Linear(3, 10, bias_attr=False)
>>> model = paddle.nn.Sequential(linear1, linear2)
>>> for prefix, layer in model.named_children():
...     print(prefix, layer)
0 Linear(in_features=10, out_features=3, dtype=float32)
1 Linear(in_features=3, out_features=10, dtype=float32)
-
named_parameters(prefix='', include_sublayers=True)¶
-
Returns an iterator over all parameters in the Layer, yielding tuple of name and parameter.
- Parameters
-
prefix (str, optional) – Prefix to prepend to all parameter names. Default: ‘’.
include_sublayers (bool, optional) – Whether include the parameters of sublayers. If True, also include the named parameters from sublayers. Default: True.
- Yields
-
(string, Parameter) – Tuple of name and Parameter
Examples
>>> import paddle >>> paddle.seed(100) >>> fc1 = paddle.nn.Linear(10, 3) >>> fc2 = paddle.nn.Linear(3, 10, bias_attr=False) >>> model = paddle.nn.Sequential(fc1, fc2) >>> for name, param in model.named_parameters(): ... print(name, param) 0.weight Parameter containing: Tensor(shape=[10, 3], dtype=float32, place=Place(cpu), stop_gradient=False, [[ 0.07276392, -0.39791510, -0.66356444], [ 0.02143478, -0.18519843, -0.32485050], [-0.42249614, 0.08450919, -0.66838276], [ 0.38208580, -0.24303678, 0.55127048], [ 0.47745085, 0.62117910, -0.08336520], [-0.28653207, 0.47237599, -0.05868882], [-0.14385653, 0.29945642, 0.12832761], [-0.21237159, 0.38539791, -0.62760031], [ 0.02637231, 0.20621127, 0.43255770], [-0.19984481, -0.26259184, -0.29696006]]) 0.bias Parameter containing: Tensor(shape=[3], dtype=float32, place=Place(cpu), stop_gradient=False, [0., 0., 0.]) 1.weight Parameter containing: Tensor(shape=[3, 10], dtype=float32, place=Place(cpu), stop_gradient=False, [[ 0.01985580, -0.40268910, 0.41172385, -0.47249708, -0.09002256, -0.00533628, -0.52048630, 0.62360322, 0.20848787, -0.02033746], [ 0.58281910, 0.12841827, 0.12907702, 0.02325618, -0.07746267, 0.31950659, -0.37924835, -0.59209681, -0.11732036, -0.58378261], [-0.62100595, 0.22293305, 0.28229684, -0.03687060, -0.59323978, 0.08411229, 0.53275704, 0.40431368, 0.03171402, -0.17922515]])
-
named_sublayers(prefix='', include_self=False, layers_set=None)¶
-
Returns an iterator over all sublayers in the Layer, yielding tuple of name and sublayer. The duplicate sublayer will only be yielded once.
- Parameters
-
prefix (str, optional) – Prefix to prepend to all parameter names. Default: ‘’.
include_self (bool, optional) – Whether include the Layer itself. Default: False.
layers_set (set, optional) – The set to record duplicate sublayers. Default: None.
- Yields
-
(string, Layer) – Tuple of name and Layer
Examples
>>> import paddle
>>> fc1 = paddle.nn.Linear(10, 3)
>>> fc2 = paddle.nn.Linear(3, 10, bias_attr=False)
>>> model = paddle.nn.Sequential(fc1, fc2)
>>> for prefix, layer in model.named_sublayers():
...     print(prefix, layer)
0 Linear(in_features=10, out_features=3, dtype=float32)
1 Linear(in_features=3, out_features=10, dtype=float32)
-
parameters(include_sublayers=True)¶
-
Returns a list of all Parameters from current layer and its sub-layers.
- Parameters
-
include_sublayers (bool, optional) – Whether to return the parameters of the sublayer. If True, the returned list contains the parameters of the sublayer. Default: True.
- Returns
-
list of Tensor, a list of Parameters.
Examples
>>> import paddle
>>> paddle.seed(100)
>>> linear = paddle.nn.Linear(1, 1)
>>> print(linear.parameters())
[Parameter containing: Tensor(shape=[1, 1], dtype=float32, place=Place(cpu), stop_gradient=False, [[0.18551230]]), Parameter containing: Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=False, [0.])]
-
register_buffer(name, tensor, persistable=True)¶
-
Registers a tensor as buffer into the layer.
buffer is a non-trainable tensor and will not be updated by optimizer, but is necessary for evaluation and inference. For example, the mean and variance in BatchNorm layers. The registered buffer is persistable by default, and will be saved into state_dict alongside parameters. If set persistable=False, it registers a non-persistable buffer, so that it will not be a part of state_dict .
Buffers can be accessed as attributes using given names.
- Parameters
-
name (string) – name of the buffer. The buffer can be accessed from this layer using the given name
tensor (Tensor) – the tensor to be registered as buffer.
persistable (bool) – whether the buffer is part of this layer’s state_dict.
- Returns
-
None
Examples
>>> import numpy as np
>>> import paddle
>>> linear = paddle.nn.Linear(10, 3)
>>> value = np.array([0]).astype("float32")
>>> buffer = paddle.to_tensor(value)
>>> linear.register_buffer("buf_name", buffer, persistable=True)
>>> # get the buffer by attribute.
>>> print(linear.buf_name)
Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True, [0.])
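A short follow-up sketch of the persistable flag described above: a non-persistable buffer is still accessible as an attribute, but it is not saved into state_dict (the buffer names are illustrative):
>>> import paddle
>>> linear = paddle.nn.Linear(10, 3)
>>> linear.register_buffer("saved_buf", paddle.zeros([1]), persistable=True)
>>> linear.register_buffer("temp_buf", paddle.zeros([1]), persistable=False)
>>> print("saved_buf" in linear.state_dict())
True
>>> print("temp_buf" in linear.state_dict())
False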
-
register_forward_post_hook(hook)¶
-
Register a forward post-hook for Layer. The hook will be called after forward function has been computed.
It should have the following form; the input and output of the hook are the input and output of the Layer, respectively. Users can use a forward post-hook to change the output of the Layer or perform information statistics tasks on the Layer.
hook(Layer, input, output) -> None or modified output
- Parameters
-
hook (function) – a function registered as a forward post-hook
- Returns
-
HookRemoveHelper, a HookRemoveHelper object that can be used to remove the added hook by calling hook_remove_helper.remove() .
Examples
>>> import paddle >>> import numpy as np >>> # the forward_post_hook change the output of the layer: output = output * 2 >>> def forward_post_hook(layer, input, output): ... # user can use layer, input and output for information statistics tasks ... ... # change the output ... return output * 2 ... >>> linear = paddle.nn.Linear(13, 5) >>> # register the hook >>> forward_post_hook_handle = linear.register_forward_post_hook(forward_post_hook) >>> value1 = np.arange(26).reshape(2, 13).astype("float32") >>> in1 = paddle.to_tensor(value1) >>> out0 = linear(in1) >>> # remove the hook >>> forward_post_hook_handle.remove() >>> out1 = linear(in1) >>> # hook change the linear's output to output * 2, so out0 is equal to out1 * 2. >>> assert (out0.numpy() == (out1.numpy()) * 2).any()
-
register_forward_pre_hook(hook)¶
-
Register a forward pre-hook for Layer. The hook will be called before forward function has been computed.
It should have the following form: the input of the hook is the input of the Layer. The hook can return either a tuple or a single modified value; a single returned value will be wrapped into a tuple (unless it is already a tuple). Users can use a forward pre-hook to change the input of the Layer or perform information statistics tasks on the Layer.
hook(Layer, input) -> None or modified input
- Parameters
-
hook (function) – a function registered as a forward pre-hook
- Returns
-
HookRemoveHelper, a HookRemoveHelper object that can be used to remove the added hook by calling hook_remove_helper.remove() .
Examples
>>> import paddle >>> import numpy as np >>> # the forward_pre_hook change the input of the layer: input = input * 2 >>> def forward_pre_hook(layer, input): ... # user can use layer and input for information statistics tasks ... ... # change the input ... input_return = (input[0] * 2) ... return input_return ... >>> linear = paddle.nn.Linear(13, 5) >>> # register the hook >>> forward_pre_hook_handle = linear.register_forward_pre_hook(forward_pre_hook) >>> value0 = np.arange(26).reshape(2, 13).astype("float32") >>> in0 = paddle.to_tensor(value0) >>> out0 = linear(in0) >>> # remove the hook >>> forward_pre_hook_handle.remove() >>> value1 = value0 * 2 >>> in1 = paddle.to_tensor(value1) >>> out1 = linear(in1) >>> # hook change the linear's input to input * 2, so out0 is equal to out1. >>> assert (out0.numpy() == out1.numpy()).any()
-
set_dict(state_dict, use_structured_name=True)¶
-
Set parameters and persistable buffers from state_dict. All the parameters and buffers will be reset by the tensor in the state_dict
- Parameters
-
state_dict (dict) – Dict contains all the parameters and persistable buffers.
use_structured_name (bool, optional) – If true, use structured name as key, otherwise, use parameter or buffer name as key. Default: True.
- Returns
-
missing_keys (list): A list of str containing the missing keys. unexpected_keys (list): A list of str containing the unexpected keys.
- Return type
-
tuple (missing_keys, unexpected_keys)
Examples
>>> import paddle
>>> emb = paddle.nn.Embedding(10, 10)
>>> state_dict = emb.state_dict()
>>> paddle.save(state_dict, "paddle_dy.pdparams")
>>> para_state_dict = paddle.load("paddle_dy.pdparams")
>>> emb.set_state_dict(para_state_dict)
-
set_state_dict(state_dict, use_structured_name=True)¶
-
Set parameters and persistable buffers from state_dict. All the parameters and buffers will be reset by the tensor in the state_dict
- Parameters
-
state_dict (dict) – Dict contains all the parameters and persistable buffers.
use_structured_name (bool, optional) – If true, use structured name as key, otherwise, use parameter or buffer name as key. Default: True.
- Returns
-
missing_keys (list): A list of str containing the missing keys. unexpected_keys (list): A list of str containing the unexpected keys.
- Return type
-
tuple (missing_keys, unexpected_keys)
Examples
>>> import paddle
>>> emb = paddle.nn.Embedding(10, 10)
>>> state_dict = emb.state_dict()
>>> paddle.save(state_dict, "paddle_dy.pdparams")
>>> para_state_dict = paddle.load("paddle_dy.pdparams")
>>> emb.set_state_dict(para_state_dict)
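Since set_state_dict returns the missing and unexpected keys described above, the call in the example can also capture them (a small sketch reusing the example's objects; both lists are expected to be empty for an exact reload):
>>> missing_keys, unexpected_keys = emb.set_state_dict(para_state_dict)
>>> print(missing_keys, unexpected_keys)
[] []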
-
state_dict(destination=None, include_sublayers=True, structured_name_prefix='', use_hook=True, keep_vars=True)¶
-
Get all parameters and persistable buffers of the current layer and its sub-layers, and set them into a dict.
- Parameters
-
destination (dict, optional) – If provide, all the parameters and persistable buffers will be set to this dict . Default: None.
include_sublayers (bool, optional) – If true, also include the parameters and persistable buffers from sublayers. Default: True.
use_hook (bool, optional) – If true, the operations contained in _state_dict_hooks will be appended to the destination. Default: True.
keep_vars (bool, optional) – If false, the returned tensors in the state dict are detached from autograd. Default: True.
- Returns
-
a dict contains all the parameters and persistable buffers.
- Return type
-
dict
Examples
>>> import paddle
>>> emb = paddle.nn.Embedding(10, 10)
>>> state_dict = emb.state_dict()
>>> paddle.save(state_dict, "paddle_dy.pdparams")
-
sublayers(include_self=False)¶
-
Returns a list of sub layers.
- Parameters
-
include_self (bool, optional) – Whether return self as sublayers. Default: False.
- Returns
-
list of Layer, a list of sub layers.
Examples
>>> import paddle >>> class MyLayer(paddle.nn.Layer): ... def __init__(self): ... super().__init__() ... self._linear = paddle.nn.Linear(1, 1) ... self._dropout = paddle.nn.Dropout(p=0.5) ... ... def forward(self, input): ... temp = self._linear(input) ... temp = self._dropout(temp) ... return temp ... >>> mylayer = MyLayer() >>> print(mylayer.sublayers()) [Linear(in_features=1, out_features=1, dtype=float32), Dropout(p=0.5, axis=None, mode=upscale_in_train)]
-
to(device=None, dtype=None, blocking=None)¶
-
Cast the parameters and buffers of the Layer to the given device, dtype and blocking.
- Parameters
-
device (str|paddle.CPUPlace()|paddle.CUDAPlace()|paddle.CUDAPinnedPlace()|paddle.XPUPlace()|None, optional) – The device on which to store the Layer. If None, the device is the same as that of the original Tensor. If device is a string, it can be cpu, gpu:x or xpu:x, where x is the index of the GPUs or XPUs. Default: None.
dtype (str|numpy.dtype|paddle.dtype|None, optional) – The type of the data. If None, the dtype is the same with the original Tensor. Default: None.
blocking (bool|None, optional) – If False and the source is in pinned memory, the copy will be asynchronous with respect to the host. Otherwise, the argument has no effect. If None, the blocking is set True. Default: None.
- Returns
-
self
Examples
>>> import paddle >>> paddle.seed(2023) >>> linear=paddle.nn.Linear(2, 2) >>> linear.weight >>> print(linear.weight) Parameter containing: Tensor(shape=[2, 2], dtype=float32, place=Place(gpu:0), stop_gradient=False, [[ 0.89611185, 0.04935038], [-0.58883440, 0.99266374]]) >>> linear.to(dtype='float64') >>> linear.weight >>> print(linear.weight) Parameter containing: Tensor(shape=[2, 2], dtype=float64, place=Place(gpu:0), stop_gradient=False, [[ 0.89611185, 0.04935038], [-0.58883440, 0.99266374]]) >>> linear.to(device='cpu') >>> linear.weight >>> print(linear.weight) Parameter containing: Tensor(shape=[2, 2], dtype=float64, place=Place(cpu), stop_gradient=False, [[ 0.89611185, 0.04935038], [-0.58883440, 0.99266374]]) >>> >>> linear.to(device=paddle.CUDAPinnedPlace(), blocking=False) >>> linear.weight >>> print(linear.weight) Tensor(shape=[2, 2], dtype=float64, place=Place(gpu_pinned), stop_gradient=False, [[ 0.89611185, 0.04935038], [-0.58883440, 0.99266374]])
-
to_static_state_dict(destination=None, include_sublayers=True, structured_name_prefix='', use_hook=True, keep_vars=True)¶
-
Get all parameters and buffers of the current layer and its sub-layers, and set them into a dict.
- Parameters
-
destination (dict, optional) – If provide, all the parameters and persistable buffers will be set to this dict . Default: None.
include_sublayers (bool, optional) – If true, also include the parameters and persistable buffers from sublayers. Default: True.
use_hook (bool, optional) – If true, the operations contained in _state_dict_hooks will be appended to the destination. Default: True.
keep_vars (bool, optional) – If false, the returned tensors in the state dict are detached from autograd. Default: True.
- Returns
-
dict, a dict contains all the parameters and persistable buffers.
Examples
>>> import paddle
>>> emb = paddle.nn.Embedding(10, 10)
>>> state_dict = emb.to_static_state_dict()
>>> paddle.save(state_dict, "paddle_dy.pdparams")
-
train()¶
-
Sets this Layer and all its sublayers to training mode. This only affects certain modules like Dropout and BatchNorm.
- Returns
-
None
Examples
>>> import paddle >>> paddle.seed(100) >>> class MyLayer(paddle.nn.Layer): ... def __init__(self): ... super().__init__() ... self._linear = paddle.nn.Linear(1, 1) ... self._dropout = paddle.nn.Dropout(p=0.5) ... ... def forward(self, input): ... temp = self._linear(input) ... temp = self._dropout(temp) ... return temp ... >>> x = paddle.randn([10, 1], 'float32') >>> mylayer = MyLayer() >>> mylayer.eval() # set mylayer._dropout to eval mode >>> out = mylayer(x) >>> mylayer.train() # set mylayer._dropout to train mode >>> out = mylayer(x) >>> print(out) Tensor(shape=[10, 1], dtype=float32, place=Place(cpu), stop_gradient=False, [[-3.44879317], [ 0. ], [ 0. ], [-0.73825276], [ 0. ], [ 0. ], [ 0.64444798], [-3.22185946], [ 0. ], [-0.68077987]])