sigmoid_focal_loss
paddle.fluid.layers.detection.sigmoid_focal_loss(x, label, fg_num, gamma=2.0, alpha=0.25)
- alias_main
paddle.nn.functional.sigmoid_focal_loss
- alias
paddle.nn.functional.sigmoid_focal_loss, paddle.nn.functional.loss.sigmoid_focal_loss
- old_api
paddle.fluid.layers.sigmoid_focal_loss
Sigmoid Focal Loss Operator.
Focal loss addresses the foreground-background class imbalance that arises during the training phase of many computer vision tasks. This OP computes the sigmoid value of each element in the input tensor x, then measures the focal loss between the sigmoid value and the target label. The focal loss is defined as follows:
\[
loss_{i,j} =
\begin{cases}
-\frac{1}{fg\_num} \cdot \alpha \cdot (1 - \sigma(x_{i,j}))^{\gamma} \cdot \log(\sigma(x_{i,j})), & j + 1 = label_{i,0} \\
-\frac{1}{fg\_num} \cdot (1 - \alpha) \cdot \sigma(x_{i,j})^{\gamma} \cdot \log(1 - \sigma(x_{i,j})), & j + 1 \neq label_{i,0}
\end{cases}
\qquad i \in [0, N-1],\ j \in [0, C-1]
\]
where
\[\sigma(x_{i,j}) = \frac{1}{1 + \exp(-x_{i,j})}\]
- Parameters
x (Variable) – A 2-D tensor with shape \([N, C]\) holding the predicted class scores of all samples, before the sigmoid is applied. \(N\) is the number of samples that contribute to optimization in a mini-batch. For object detection, the samples are anchor boxes and \(N\) is the total number of positive and negative samples in a mini-batch; for image classification, the samples are images and \(N\) is the number of images in a mini-batch. \(C\) is the number of classes (notice: excluding background). The data type of x is float32 or float64.
label (Variable) – A 2-D tensor with shape \([N, 1]\) holding the target labels for classification. \(N\) is the number of samples that contribute to optimization in a mini-batch; each sample has one target category. The values for positive samples are in the range \([1, C]\), and the value for negative samples is 0. The data type of label is int32.
fg_num (Variable) – A 1-D tensor with shape [1] holding the number of positive samples in a mini-batch, which must be computed before calling this OP. The data type of fg_num is int32.
gamma (int|float) – Hyper-parameter that balances easy and hard examples. The default value is 2.0.
alpha (int|float) – Hyper-parameter that balances positive and negative examples. The default value is 0.25.
- Returns
A 2-D tensor with shape \([N, C]\), which is the focal loss of each element in the input tensor x.
- Return type
Variable (the data type is float32 or float64)
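To make the formula and the tensor shapes above concrete, here is a minimal NumPy sketch of the per-element loss. It is not the Paddle kernel: the function name `sigmoid_focal_loss_ref` is hypothetical, `fg_num` is assumed to be a plain positive number, and no numerical stabilization is applied to the logarithms.

```python
import numpy as np


def sigmoid_focal_loss_ref(x, label, fg_num, gamma=2.0, alpha=0.25):
    """Hypothetical NumPy reference for the per-element focal loss formula.

    x:      float array of shape [N, C] (scores before sigmoid)
    label:  int array of shape [N, 1]; 0 = background, 1..C = foreground class
    fg_num: number of positive samples (assumed >= 1 in this sketch)
    """
    x = np.asarray(x, dtype=np.float64)
    label = np.asarray(label).reshape(-1, 1)
    num_classes = x.shape[1]

    p = 1.0 / (1.0 + np.exp(-x))  # sigma(x_ij)

    # Column j is the target of sample i when j + 1 == label[i, 0].
    is_target = (np.arange(num_classes)[None, :] + 1) == label

    pos = -alpha * (1.0 - p) ** gamma * np.log(p)
    neg = -(1.0 - alpha) * p ** gamma * np.log(1.0 - p)
    return np.where(is_target, pos, neg) / fg_num


# Two samples, three classes: sample 0 has class 2, sample 1 is background.
x = np.zeros((2, 3))
label = np.array([[2], [0]])
loss = sigmoid_focal_loss_ref(x, label, fg_num=1)
print(loss.shape)  # (2, 3)
```

With all scores at zero, every sigmoid value is 0.5, so the only difference between elements is the `alpha` versus `1 - alpha` weighting of the target and non-target columns.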
Examples
import numpy as np
import paddle.fluid as fluid

num_classes = 10  # excluding background
image_width = 16
image_height = 16
batch_size = 32
max_iter = 20


def gen_train_data():
    x_data = np.random.uniform(
        0, 255, (batch_size, 3, image_height, image_width)).astype('float64')
    # 0 is background; values >= 1 are foreground class labels.
    label_data = np.random.randint(
        0, num_classes, (batch_size, 1)).astype('int32')
    return {"x": x_data, "label": label_data}


def get_focal_loss(pred, label, fg_num, num_classes):
    pred = fluid.layers.reshape(pred, [-1, num_classes])
    label = fluid.layers.reshape(label, [-1, 1])
    label.stop_gradient = True
    loss = fluid.layers.sigmoid_focal_loss(
        pred, label, fg_num, gamma=2.0, alpha=0.25)
    loss = fluid.layers.reduce_sum(loss)
    return loss


def build_model(mode='train'):
    x = fluid.data(name="x", shape=[-1, 3, -1, -1], dtype='float64')
    output = fluid.layers.pool2d(input=x, pool_type='avg', global_pooling=True)
    output = fluid.layers.fc(
        input=output,
        size=num_classes,
        # Notice: size is set to the number of target classes (excluding background)
        # because the sigmoid activation is applied inside the sigmoid_focal_loss op.
        act=None)
    if mode == 'train':
        label = fluid.data(name="label", shape=[-1, 1], dtype='int32')
        # Obtain the fg_num needed by the sigmoid_focal_loss op:
        # 0 in label represents background, >= 1 in label represents foreground.
        # Find the elements in label that are greater than or equal to 1,
        # then count them.
        data = fluid.layers.fill_constant(shape=[1], value=1, dtype='int32')
        fg_label = fluid.layers.greater_equal(label, data)
        fg_label = fluid.layers.cast(fg_label, dtype='int32')
        fg_num = fluid.layers.reduce_sum(fg_label)
        fg_num.stop_gradient = True
        avg_loss = get_focal_loss(output, label, fg_num, num_classes)
        return avg_loss
    else:
        # During evaluation or testing, the output of the final fc layer
        # should be passed through a sigmoid layer.
        pred = fluid.layers.sigmoid(output)
        return pred


loss = build_model('train')
moment_optimizer = fluid.optimizer.MomentumOptimizer(
    learning_rate=0.001, momentum=0.9)
moment_optimizer.minimize(loss)
place = fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
for i in range(max_iter):
    outs = exe.run(feed=gen_train_data(), fetch_list=[loss.name])
    print(outs)
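As a side illustration of the gamma parameter, the following NumPy sketch evaluates only the positive-sample branch of the formula for an easy example (sigmoid value 0.9) and a hard example (sigmoid value 0.1). The helper name `positive_term` and the two probabilities are assumptions chosen for illustration; `fg_num` is taken to be 1.

```python
import numpy as np


def positive_term(p, gamma=2.0, alpha=0.25):
    """Loss of a target element whose sigmoid value is p (fg_num assumed 1)."""
    return -alpha * (1.0 - p) ** gamma * np.log(p)


easy, hard = 0.9, 0.1  # well-classified vs. badly classified positive

ratios = {}
for gamma in (0.0, 2.0):
    # How much more the hard example contributes than the easy one.
    ratios[gamma] = positive_term(hard, gamma) / positive_term(easy, gamma)
    print(f"gamma={gamma}: hard/easy loss ratio = {ratios[gamma]:.1f}")
```

Going from gamma = 0 to gamma = 2 multiplies the hard/easy ratio by ((1 - 0.1) / (1 - 0.9))^2 = 81, which is the sense in which gamma shifts the loss toward hard examples.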