Model Parameters¶
Model parameters are the weights and biases of a model. In Fluid, they are instances of the fluid.Parameter class, which inherits from fluid.Variable, and they are all persistable variables. Model training is the process of learning and updating model parameters. The attributes of a model parameter are configured through api_fluid_ParamAttr. The configurable attributes are:
Initialization method
Regularization
Gradient clipping
Model averaging
Initialization method¶
Fluid initializes a single parameter through the initializer attribute of ParamAttr.
Example:
param_attrs = fluid.ParamAttr(
    name="fc_weight",
    initializer=fluid.initializer.ConstantInitializer(1.0))
y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)
Fluid supports the following initialization methods:
1. BilinearInitializer¶
Bilinear initialization. A deconvolution (transposed convolution) operation whose weights are initialized this way can be used as a linear interpolation (upsampling) operation.
Alias: Bilinear
API reference: api_fluid_initializer_BilinearInitializer
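To make the link between bilinear initialization and interpolation concrete, here is a minimal sketch in plain Python of the bilinear kernel formula commonly used for upsampling filters. The helper names bilinear_kernel_1d and bilinear_kernel_2d are hypothetical, and the formula is an assumption about the usual construction, not a copy of Fluid's implementation.

```python
def bilinear_kernel_1d(size):
    """Return 1-D bilinear interpolation weights for a kernel with `size` taps."""
    factor = (size + 1) // 2
    # The kernel center sits on a tap for odd sizes and between taps for even sizes.
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    return [1 - abs(i - center) / factor for i in range(size)]


def bilinear_kernel_2d(size):
    """The 2-D kernel is the outer product of the 1-D weights with themselves."""
    w = bilinear_kernel_1d(size)
    return [[a * b for b in w] for a in w]


if __name__ == "__main__":
    # A 4-tap kernel is the usual choice for 2x upsampling.
    for row in bilinear_kernel_2d(4):
        print(row)
```

The weights fall off linearly with distance from the kernel center, which is why a deconvolution layer loaded with them behaves like linear interpolation.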
2. ConstantInitializer¶
Constant initialization. Initializes the parameter to the specified value.
Alias: Constant
API reference: api_fluid_initializer_ConstantInitializer
3. MSRAInitializer¶
MSRA initialization, the method proposed by He et al. for layers with rectifier activations. See https://arxiv.org/abs/1502.01852 for details.
Alias: MSRA
API reference: api_fluid_initializer_MSRAInitializer
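As a rough illustration of the idea in the referenced paper, the sketch below draws weights from a zero-mean Gaussian whose standard deviation is sqrt(2 / fan_in). The function names msra_std and msra_normal are hypothetical, and Fluid's own initializer may expose other options (such as a uniform variant) that are not shown here.

```python
import math
import random


def msra_std(fan_in):
    """Standard deviation suggested for rectifier layers: sqrt(2 / fan_in)."""
    return math.sqrt(2.0 / fan_in)


def msra_normal(fan_in, fan_out, seed=0):
    """Return a fan_in x fan_out weight matrix sampled from N(0, msra_std^2)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    std = msra_std(fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


if __name__ == "__main__":
    weights = msra_normal(fan_in=8, fan_out=4)
    print(len(weights), len(weights[0]), msra_std(8))
```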
4. NormalInitializer¶
Random initialization from a Gaussian (normal) distribution.
Alias: Normal
API reference: api_fluid_initializer_NormalInitializer
5. TruncatedNormalInitializer¶
Random initialization from a truncated Gaussian distribution.
Alias: TruncatedNormal
API reference: api_fluid_initializer_TruncatedNormalInitializer
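One simple way to picture a truncated Gaussian is rejection sampling: draw from the ordinary Gaussian and discard values that fall outside a bound. The sketch below assumes a cutoff of two standard deviations, which is a common convention but is not stated in this document; the function name truncated_normal is hypothetical.

```python
import random


def truncated_normal(n, mean=0.0, std=1.0, num_std=2.0, seed=0):
    """Draw n samples from N(mean, std^2), rejecting values beyond num_std deviations."""
    rng = random.Random(seed)
    low, high = mean - num_std * std, mean + num_std * std
    samples = []
    while len(samples) < n:
        x = rng.gauss(mean, std)
        if low <= x <= high:  # keep only values inside the truncation interval
            samples.append(x)
    return samples


if __name__ == "__main__":
    values = truncated_normal(1000, std=0.02)
    print(min(values), max(values))
```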
6. UniformInitializer¶
Random initialization from a uniform distribution.
Alias: Uniform
API reference: api_fluid_initializer_UniformInitializer
7. XavierInitializer¶
Xavier initialization, the method proposed by Glorot and Bengio. See http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf for details.
Alias: Xavier
API reference: api_fluid_initializer_XavierInitializer
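The referenced paper scales the initial weights by both the input and output width of a layer. A minimal sketch of the uniform variant follows, with weights drawn from [-limit, limit] where limit = sqrt(6 / (fan_in + fan_out)). The names xavier_limit and xavier_uniform are hypothetical and the code is illustrative, not Fluid's implementation.

```python
import math
import random


def xavier_limit(fan_in, fan_out):
    """Bound of the uniform range: sqrt(6 / (fan_in + fan_out))."""
    return math.sqrt(6.0 / (fan_in + fan_out))


def xavier_uniform(fan_in, fan_out, seed=0):
    """Return a fan_in x fan_out weight matrix sampled from U(-limit, limit)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    limit = xavier_limit(fan_in, fan_out)
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


if __name__ == "__main__":
    weights = xavier_uniform(fan_in=3, fan_out=3)
    print(xavier_limit(3, 3), weights[0])
```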
Regularization¶
Fluid regularizes a single parameter through the regularizer attribute of ParamAttr.
Example:
param_attrs = fluid.ParamAttr(
    name="fc_weight",
    regularizer=fluid.regularizer.L1DecayRegularizer(0.1))
y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)
Fluid supports the following regularization methods:
api_fluid_regularizer_L1DecayRegularizer (Alias: L1Decay)
api_fluid_regularizer_L2DecayRegularizer (Alias: L2Decay)
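To show what the two regularizers contribute during training, the sketch below computes the extra gradient term each one adds to a parameter. It assumes the L1 penalty is coeff * sum(|w|) and the L2 penalty is 0.5 * coeff * sum(w^2); the exact scaling used by Fluid is not stated in this document, and the function names are hypothetical.

```python
def l1_decay_grad(weights, coeff):
    """Gradient of coeff * sum(|w|): coeff times the sign of each weight."""
    def sign(w):
        return (w > 0) - (w < 0)
    return [coeff * sign(w) for w in weights]


def l2_decay_grad(weights, coeff):
    """Gradient of 0.5 * coeff * sum(w^2): coeff times each weight."""
    return [coeff * w for w in weights]


if __name__ == "__main__":
    w = [2.0, -3.0, 0.0]
    print(l1_decay_grad(w, 0.1))
    print(l2_decay_grad(w, 0.1))
```

L1 decay pushes every nonzero weight toward zero by a constant amount, which encourages sparsity, while L2 decay shrinks each weight in proportion to its size.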
Clipping¶
Fluid sets the clipping method for a single parameter through the gradient_clip attribute of ParamAttr.
Example:
param_attrs = fluid.ParamAttr(
    name="fc_weight",
    # illustrative clipping range
    gradient_clip=fluid.clip.GradientClipByValue(max=1.0, min=-1.0))
y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)
Fluid supports the following clipping methods:
1. ErrorClipByValue¶
Clips the value of a tensor to a specified range.
API reference: api_fluid_clip_ErrorClipByValue
2. GradientClipByGlobalNorm¶
Limits the global norm of multiple Tensors to clip_norm.
API reference: api_fluid_clip_GradientClipByGlobalNorm
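A minimal sketch of global-norm clipping in plain Python: the global norm is the L2-norm of all gradients taken together, and every gradient is multiplied by the same factor so that the global norm does not exceed clip_norm. The function name clip_by_global_norm is hypothetical, and gradients are represented as flat lists for simplicity.

```python
import math


def clip_by_global_norm(grads, clip_norm):
    """Scale a list of gradients jointly so their global L2-norm is at most clip_norm."""
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # The factor is 1.0 when the global norm is already within the limit.
    scale = clip_norm / max(global_norm, clip_norm)
    return [[g * scale for g in grad] for grad in grads]


if __name__ == "__main__":
    grads = [[3.0], [4.0]]  # global norm is 5.0
    print(clip_by_global_norm(grads, clip_norm=1.0))
```

Because all gradients share one scale factor, their relative magnitudes and the overall update direction are preserved.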
3. GradientClipByNorm¶
Limits the L2-norm of a Tensor to max_norm. If the Tensor's L2-norm exceeds max_norm, a scale factor (max_norm divided by the L2-norm) is computed and every value of the Tensor is multiplied by it.
API reference: api_fluid_clip_GradientClipByNorm
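The scaling rule described above can be sketched for a single gradient as follows. The function name clip_by_norm is hypothetical and the gradient is a flat list for simplicity.

```python
import math


def clip_by_norm(grad, max_norm):
    """Rescale one gradient so that its L2-norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)  # already within the limit, returned unchanged
    scale = max_norm / norm
    return [g * scale for g in grad]


if __name__ == "__main__":
    print(clip_by_norm([3.0, 4.0], max_norm=1.0))
    print(clip_by_norm([0.3, 0.4], max_norm=1.0))
```

Unlike global-norm clipping, each Tensor is treated independently here, so different parameters may be scaled by different factors.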
4. GradientClipByValue¶
Limits each value of a parameter's gradient to the range [min, max].
API reference: api_fluid_clip_GradientClipByValue
Model Averaging¶
Fluid determines whether to average a single parameter through the do_model_average attribute of ParamAttr.
Example:
param_attrs = fluid.ParamAttr(name="fc_weight", do_model_average=True)
y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)
In mini-batch training, parameters are updated once after each batch. Model averaging takes the average of the parameter values produced by the most recent K updates.
The averaged parameters are used only for testing and prediction; they are not involved in the actual training process.
API reference: api_fluid_optimizer_ModelAverage
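The averaging over the latest K updates can be pictured with a small sliding window. The sketch below is a simplification under stated assumptions: it uses a fixed window of K snapshots, the class name SlidingAverage is hypothetical, and the way Fluid's ModelAverage chooses its window is not described in this document.

```python
from collections import deque


class SlidingAverage:
    """Keep the last k parameter snapshots and report their element-wise mean."""

    def __init__(self, k):
        self.window = deque(maxlen=k)  # older snapshots are dropped automatically

    def update(self, params):
        """Record the parameter values produced by one training update."""
        self.window.append(list(params))

    def average(self):
        """Return the averaged parameters, intended for testing and prediction only."""
        n = len(self.window)
        return [sum(values) / n for values in zip(*self.window)]


if __name__ == "__main__":
    avg = SlidingAverage(k=2)
    for params in ([1.0, 10.0], [2.0, 20.0], [4.0, 40.0]):
        avg.update(params)
    print(avg.average())
```

Training continues to use the raw parameters; only evaluation reads the averaged values.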