Basic Concept¶
Program¶
Fluid describes a neural network configuration as an abstract syntax tree, similar to that of a programming language, and the user's description of the computation is written into a Program. Program in Fluid replaces the concept of a model in traditional frameworks. It can describe arbitrarily complex models through three execution structures: sequential execution, conditional selection, and loop execution. Writing a Program is very close to writing an ordinary program. If you have programming experience, you can apply it directly.
In brief:
- A model is a Fluid Program, and a model can contain more than one Program.
- A Program consists of nested Blocks. A Block is analogous to a pair of braces in C++ or Java, or to an indentation block in Python.
- Computation in a Block is organized in three ways: sequential execution, conditional selection, and loop execution. Together these make up complex computational logic.
- A Block contains descriptions of computation and of the objects being computed. The description of a computation is called an Operator; the object of a computation (the input or output of an Operator) is uniformly a Tensor. In Fluid, a Tensor is represented as a LoD-Tensor with LoD level 0.
Block¶
Block corresponds to the concept of variable scope in high-level languages. In a programming language, a block is a pair of braces that contains local variable definitions and a series of instructions or operators. The control flow structures if-else and for in programming languages have the following counterparts in deep learning:
| programming languages | Fluid |
|---|---|
| for, while loop | RNN, WhileOP |
| if-else, switch | IfElseOp, SwitchOp |
| execute sequentially | a series of layers |
As mentioned above, a Block in Fluid describes a set of Operators, arranged by sequential execution, conditional selection, or loop execution, together with the objects those Operators work on: Tensors.
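The nesting described above can be pictured with a small, framework-independent sketch. The classes ToyProgram, ToyBlock, and ToyOperator are hypothetical stand-ins written for this illustration; they are not Fluid's real classes and only mimic the containment relationship: a program holds blocks, a block holds variables and operators, and a control-flow operator points at a nested sub-block.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyOperator:
    """A computation step; control-flow operators own a nested sub-block."""
    type: str
    inputs: List[str]
    outputs: List[str]
    sub_block: Optional[int] = None  # index of the nested block, if any


@dataclass
class ToyBlock:
    """A scope: local variables plus an ordered list of operators."""
    idx: int
    parent_idx: int  # -1 marks the root block
    variables: Dict[str, str] = field(default_factory=dict)  # name -> dtype
    ops: List[ToyOperator] = field(default_factory=list)


@dataclass
class ToyProgram:
    """A program is a list of blocks; block 0 is the root."""
    blocks: List[ToyBlock] = field(default_factory=lambda: [ToyBlock(0, -1)])

    def create_block(self, parent_idx: int) -> ToyBlock:
        block = ToyBlock(idx=len(self.blocks), parent_idx=parent_idx)
        self.blocks.append(block)
        return block

    def find_var(self, block_idx: int, name: str) -> Optional[str]:
        """Look a variable up in a block, then in its enclosing blocks."""
        while block_idx != -1:
            block = self.blocks[block_idx]
            if name in block.variables:
                return block.variables[name]
            block_idx = block.parent_idx
        return None


program = ToyProgram()
root = program.blocks[0]
root.variables.update({"x": "float32", "cond": "bool", "y": "float32"})

# Sequential execution: operators in a block run in the order they were added.
root.ops.append(ToyOperator("scale", inputs=["x"], outputs=["y"]))

# Loop execution: a while-style operator refers to a nested block.
body = program.create_block(parent_idx=root.idx)
body.variables["tmp"] = "float32"
body.ops.append(ToyOperator("scale", inputs=["y"], outputs=["tmp"]))
root.ops.append(
    ToyOperator("while", inputs=["cond"], outputs=["y"], sub_block=body.idx)
)

print(program.find_var(body.idx, "x"))    # found through the enclosing scope
print(program.find_var(root.idx, "tmp"))  # local to the loop body, so not found
```

The lookup in find_var walks outward through parent blocks, which is the same scoping behavior that braces give in C++ or Java: inner blocks see outer variables, but not the other way round.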
Operator¶
In Fluid, all operations on data are represented by Operator. In Python, Fluid Operators are encapsulated in modules such as paddle.fluid.layers and paddle.fluid.nets.
This is because some common operations on Tensors consist of several more basic operations. For simplicity, the framework wraps the basic Operators internally, handling details such as creating the learnable parameters an Operator relies on and initializing those parameters, which reduces the cost of further development.
For more information, refer to Fluid Design Idea.
Variable¶
In Fluid, a Variable can contain any type of value; in most cases it is a LoD-Tensor.
All the learnable parameters in a model are kept in memory in the form of Variables. In most cases, you do not need to create the learnable parameters of the network yourself. Fluid provides encapsulation for almost all common basic computing modules of neural networks. Taking the simplest fully connected model as an example, calling fluid.layers.fc directly creates two learnable parameters for the fully connected layer, namely the connection weight (W) and the bias, without explicitly calling Variable-related interfaces to create them.
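As a rough illustration of what such an encapsulated layer does behind the scenes, the sketch below builds a fully connected layer in plain Python. The function toy_fc, its params dictionary, and the key names are hypothetical and chosen for this example; this is not the Fluid implementation. The point is only that a single call both creates the two learnable parameters (weight and bias) and applies the computation y = xW + b.

```python
import random
from typing import Dict, List

Matrix = List[List[float]]


def toy_fc(x: Matrix, size: int, params: Dict[str, list], name: str = "fc") -> Matrix:
    """Compute y = x @ W + b, creating W and b on first use."""
    in_features = len(x[0])
    w_key, b_key = name + ".w", name + ".b"
    if w_key not in params:
        # The learnable parameters are created implicitly by the layer call.
        params[w_key] = [
            [random.uniform(-0.1, 0.1) for _ in range(size)]
            for _ in range(in_features)
        ]
        params[b_key] = [0.0] * size
    w, b = params[w_key], params[b_key]
    return [
        [sum(row[k] * w[k][j] for k in range(in_features)) + b[j]
         for j in range(size)]
        for row in x
    ]


params: Dict[str, list] = {}
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # batch of 2 samples, 3 features each
y = toy_fc(x, size=2, params=params)

print(sorted(params))     # names of the parameters the call created
print(len(y), len(y[0]))  # output shape: batch size x layer size
```

Calling toy_fc again with the same params dictionary and name reuses the existing weight and bias instead of creating new ones.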
Name¶
In Fluid, some layers, such as fluid.layers.fc, take a name parameter. This name is generally used as the prefix that identifies the output and the weights of a network layer. The specific rules are as follows:
- Prefix identification for the output of layers. If name is specified in the layer, Fluid names the output nameValue.tmp_number. If name is not specified, OPName_number.tmp_number is generated automatically to name the output. The numbers are incremented automatically to distinguish different network layers under the same operator.
- Prefix identification for weight or bias variables. If the weight and bias variables are created through param_attr and bias_attr in an operator, such as fluid.layers.embedding and fluid.layers.fc, Fluid generates prefix.w_number or prefix.b_number as a unique identifier for them, where prefix is the name specified by the user or the OPName_number generated by default. If name is specified in param_attr or bias_attr, that name is used and no name is generated automatically. Refer to the sample code for details.
In addition, the weights of multiple network layers can be shared by specifying the name parameter in fluid.ParamAttr.
Sample Code:
import paddle.fluid as fluid
import numpy as np
x = fluid.layers.data(name='x', shape=[1], dtype='int64', lod_level=1)
emb = fluid.layers.embedding(input=x, size=(128, 100)) # embedding_0.w_0
emb = fluid.layers.Print(emb) # Tensor[embedding_0.tmp_0]
# default name
fc_none = fluid.layers.fc(input=emb, size=1) # fc_0.w_0, fc_0.b_0
fc_none = fluid.layers.Print(fc_none) # Tensor[fc_0.tmp_1]
fc_none1 = fluid.layers.fc(input=emb, size=1) # fc_1.w_0, fc_1.b_0
fc_none1 = fluid.layers.Print(fc_none1) # Tensor[fc_1.tmp_1]
# name in ParamAttr
w_param_attrs = fluid.ParamAttr(name="fc_weight", learning_rate=0.5, trainable=True)
print(w_param_attrs.name) # fc_weight
# name == 'my_fc'
my_fc1 = fluid.layers.fc(input=emb, size=1, name='my_fc', param_attr=w_param_attrs) # fc_weight, my_fc.b_0
my_fc1 = fluid.layers.Print(my_fc1) # Tensor[my_fc.tmp_1]
my_fc2 = fluid.layers.fc(input=emb, size=1, name='my_fc', param_attr=w_param_attrs) # fc_weight, my_fc.b_1
my_fc2 = fluid.layers.Print(my_fc2) # Tensor[my_fc.tmp_3]
place = fluid.CPUPlace()
x_data = np.array([[1],[2],[3]]).astype("int64")
x_lodTensor = fluid.create_lod_tensor(x_data, [[1, 2]], place)
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
ret = exe.run(feed={'x': x_lodTensor}, fetch_list=[fc_none, fc_none1, my_fc1, my_fc2], return_numpy=False)
In the above example, fc_none and fc_none1 do not specify the name parameter, so the outputs of these two layers are named fc_0.tmp_1 and fc_1.tmp_1, following the form OPName_number.tmp_number. The numbers in fc_0 and fc_1 are incremented automatically to distinguish these two fully connected layers. The other two fully connected layers, my_fc1 and my_fc2, both specify the name parameter with the same value. Fluid distinguishes the two layers by the suffix tmp_number, giving my_fc.tmp_1 and my_fc.tmp_3.
Variables created in the emb layer and in fc_none and fc_none1 are named with the OPName_number prefix, such as embedding_0.w_0, fc_0.w_0, and fc_0.b_0. This prefix is consistent with the prefix of the network layer. For the my_fc1 and my_fc2 layers, the shared weight takes the name fc_weight specified in ParamAttr, which has priority. The bias variables my_fc.b_0 and my_fc.b_1 have no name of their own, so they fall back to the name given in the operator as their prefix.
In the above example, the two fully connected layers my_fc1 and my_fc2 share their weight parameter by constructing a ParamAttr and specifying its name parameter.
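The naming rules can be reproduced with a small pure-Python model. ToyNameGenerator and toy_fc_names are hypothetical helpers written for this illustration and are not Fluid's internal code. The sketch assumes one counter per key and assumes, based on the output names in the sample (tmp_1 and tmp_3), that each fully connected layer consumes two tmp numbers, with the final output carrying the second.

```python
from collections import defaultdict
from typing import Dict, Optional


class ToyNameGenerator:
    """Hands out key_0, key_1, ... with one counter per key."""

    def __init__(self) -> None:
        self._counters: Dict[str, int] = defaultdict(int)

    def generate(self, key: str) -> str:
        number = self._counters[key]
        self._counters[key] += 1
        return "{}_{}".format(key, number)


def toy_fc_names(gen: ToyNameGenerator,
                 name: Optional[str] = None,
                 weight_name: Optional[str] = None) -> Dict[str, str]:
    """Return the names a toy fc layer gives its weight, bias, and output."""
    # Prefix: the user-supplied name, or OPName_number by default.
    prefix = name if name is not None else gen.generate("fc")
    # A name given through the parameter attribute wins over a generated one.
    weight = weight_name if weight_name is not None else gen.generate(prefix + ".w")
    bias = gen.generate(prefix + ".b")
    # Assumption: two temporaries per layer (before and after adding the
    # bias), so the final output uses the second tmp number.
    gen.generate(prefix + ".tmp")
    output = gen.generate(prefix + ".tmp")
    return {"weight": weight, "bias": bias, "output": output}


gen = ToyNameGenerator()
fc_none = toy_fc_names(gen)
fc_none1 = toy_fc_names(gen)
my_fc1 = toy_fc_names(gen, name="my_fc", weight_name="fc_weight")
my_fc2 = toy_fc_names(gen, name="my_fc", weight_name="fc_weight")

for layer in (fc_none, fc_none1, my_fc1, my_fc2):
    print(layer)
```

Because the counters are keyed by the full prefix, two layers that reuse the name my_fc keep drawing from the same my_fc.b and my_fc.tmp counters, which is why their bias and output names differ only in the trailing number while the explicitly named weight is identical.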