ReduceLROnPlateau

class paddle.fluid.dygraph.learning_rate_scheduler.ReduceLROnPlateau(learning_rate, mode='min', decay_rate=0.1, patience=10, verbose=False, threshold=0.0001, threshold_mode='rel', cooldown=0, min_lr=0, eps=1e-08, dtype='float32')

Api_attr: imperative

Reduces the learning rate when the loss has stopped descending. Models often benefit from reducing the learning rate by a factor of 2 to 10 once performance stops improving.

The loss is the value passed to step ; it must be a 1-D Tensor with shape [1]. When the loss stops descending for patience epochs, the learning rate is reduced to learning_rate * decay_rate . (mode can also be set to 'max' ; in that case, the learning rate is reduced when the loss stops ascending for patience epochs.)

In addition, after each reduction, the scheduler waits cooldown epochs before resuming normal operation.
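The interaction of patience, decay_rate and cooldown described above can be sketched in plain Python. This is an illustrative sketch, not the library's implementation: the helper name plateau_schedule is hypothetical, it handles only mode='min' with a plain "strictly lower than the best loss so far" test, and it assumes a reduction fires once the count of non-improving epochs exceeds patience.

```python
def plateau_schedule(losses, lr=1.0, decay_rate=0.5, patience=2, cooldown=1):
    """Hypothetical helper: learning rate in effect after each epoch (mode='min')."""
    best = None          # best (lowest) loss seen so far
    bad_epochs = 0       # consecutive epochs without improvement
    cooldown_left = 0    # epochs remaining before monitoring resumes
    history = []
    for loss in losses:
        if cooldown_left > 0:
            # Still cooling down after a reduction: do not count this epoch.
            cooldown_left -= 1
        else:
            if best is None or loss < best:
                best = loss
                bad_epochs = 0
            else:
                bad_epochs += 1
            # Assumption: reduce once bad epochs exceed patience.
            if bad_epochs > patience:
                lr *= decay_rate
                cooldown_left = cooldown
                bad_epochs = 0
        history.append(lr)
    return history


# The loss improves once, then stays flat.
history = plateau_schedule([1.0, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9])
print(history)
```

With these inputs the rate is halved at the fifth epoch, held through one cooldown epoch, and halved again only after the loss has stalled for another full patience window.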

Parameters

learning_rate (Variable|float|int) – The initial learning rate. It can be a Python float or int. If the type is Variable, it should be a 1-D Tensor with shape [1], and the data type can be 'float32' or 'float64'.
mode (str, optional) – Either 'min' or 'max' . The usual choice is 'min' , which means the learning rate is reduced when the loss stops descending. If set to 'max' , the learning rate is reduced when the loss stops ascending. Default: 'min' .
decay_rate (float, optional) – The ratio by which the learning rate is reduced: new_lr = origin_lr * decay_rate . It should be less than 1.0. Default: 0.1.
patience (int, optional) – When the loss does not improve for this number of epochs, the learning rate is reduced. Default: 10.
verbose (bool, optional) – If True, prints a message to stdout for each update. Default: False.
threshold (float, optional) – threshold and threshold_mode together determine the minimum change in loss that counts as an improvement, so that tiny changes in loss are ignored. Default: 1e-4.
threshold_mode (str, optional) – Either 'rel' or 'abs' . In 'rel' mode, the minimum change in loss is last_loss * threshold , where last_loss is the loss in the last epoch. In 'abs' mode, the minimum change in loss is threshold . Default: 'rel' .
cooldown (int, optional) – The number of epochs to wait before resuming normal operation. Default: 0.
min_lr (float, optional) – The lower bound of the learning rate after reduction. Default: 0.
eps (float, optional) – Minimal decay applied to lr. If the difference between new and old lr is smaller than eps, the update is ignored. Default: 1e-8.
dtype (str, optional) – The data type used to create the learning rate variable. It can be 'float32' or 'float64'. Default: 'float32'.
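The threshold, threshold_mode, min_lr and eps descriptions above can be read as the following plain-Python sketch. The helper names is_improvement and reduced_lr are hypothetical, and the exact comparisons are assumptions inferred from the parameter descriptions rather than taken from the library source; the 'rel' branch also assumes a positive loss.

```python
def is_improvement(loss, last_loss, threshold=1e-4, threshold_mode="rel", mode="min"):
    """Hypothetical helper: does `loss` beat `last_loss` by at least the minimum change?"""
    if threshold_mode == "rel":
        min_change = last_loss * threshold   # relative to the reference loss
    elif threshold_mode == "abs":
        min_change = threshold               # fixed absolute margin
    else:
        raise ValueError("threshold_mode must be 'rel' or 'abs'")
    if mode == "min":
        return loss < last_loss - min_change
    if mode == "max":
        return loss > last_loss + min_change
    raise ValueError("mode must be 'min' or 'max'")


def reduced_lr(lr, decay_rate=0.1, min_lr=0.0, eps=1e-8):
    """Hypothetical helper: apply decay_rate, clamp at min_lr, skip updates below eps."""
    new_lr = max(lr * decay_rate, min_lr)
    return new_lr if lr - new_lr > eps else lr


# A drop from 100.0 to 99.95 is too small in 'rel' mode with threshold=0.001
# (needs a drop of more than 0.1), but large enough in 'abs' mode (more than 0.001).
rel_result = is_improvement(99.95, 100.0, threshold=0.001, threshold_mode="rel")
abs_result = is_improvement(99.95, 100.0, threshold=0.001, threshold_mode="abs")
print(rel_result, abs_result)

# min_lr acts as a floor; once the floor is reached, further updates are ignored.
print(reduced_lr(1e-3, decay_rate=0.1, min_lr=5e-4))
print(reduced_lr(5e-4, decay_rate=0.1, min_lr=5e-4))
```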

Returns

Reduced learning rate.

Examples:

import paddle.fluid as fluid
import numpy as np

with fluid.dygraph.guard():
    x = np.random.uniform(-1, 1, [10, 10]).astype("float32")
    linear = fluid.dygraph.Linear(10, 10)
    input = fluid.dygraph.to_variable(x)

    reduce_lr = fluid.dygraph.ReduceLROnPlateau(
                            learning_rate = 1.0,
                            decay_rate = 0.5,
                            patience = 5,
                            verbose = True,
                            cooldown = 3)
    adam = fluid.optimizer.Adam(
        learning_rate = reduce_lr,
        parameter_list = linear.parameters())

    for epoch in range(10):
        total_loss = 0
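        for batch_id in range(5):
            out = linear(input)
            loss = fluid.layers.reduce_mean(out)
            total_loss += loss
            # compute gradients before the optimizer applies the update
            loss.backward()
            adam.minimize(loss)
            linear.clear_gradients()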

        avg_loss = total_loss/5

        # adjust learning rate according to avg_loss
        reduce_lr.step(avg_loss)
        lr = adam.current_step_lr()
        print("current avg_loss is %s, current lr is %s" % (avg_loss.numpy()[0], lr))

          

step(loss)

Should be invoked once per epoch. Updates the learning rate in the optimizer according to loss . The new learning rate takes effect on the next call to optimizer.minimize .

Parameters: loss (Variable) – A Variable that is monitored to determine whether the learning rate should be reduced. If it stops descending for patience epochs, the learning rate is reduced. It should be a 1-D Tensor with shape [1]. If mode has been set to 'max' , the learning rate is reduced when it stops ascending.
Returns: None

Examples

Please refer to the example for the current LearningRateDecay class (ReduceLROnPlateau) above.

create_lr_var(lr)

Converts lr from a float to a Variable.

Parameters: lr – learning rate
Returns: learning rate variable

set_dict(state_dict): Loads the scheduler's state.

set_state_dict(state_dict): Loads the scheduler's state.

state_dict()

Returns the state of the scheduler as a dict.

It is a subset of self.__dict__ .
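The "subset of self.__dict__" idea, together with loading that state back, can be illustrated with a small stand-in class. This is a sketch only: TinyPlateauState and the key names in _STATE_KEYS are hypothetical, and which attributes the real scheduler actually saves is not stated here.

```python
class TinyPlateauState:
    """Hypothetical stand-in showing a state dict built from a subset of __dict__."""

    # Assumed key names for illustration only.
    _STATE_KEYS = ("best_loss", "num_bad_epochs", "cooldown_counter")

    def __init__(self, patience=10):
        self.patience = patience      # configuration, not part of the saved state
        self.best_loss = None
        self.num_bad_epochs = 0
        self.cooldown_counter = 0

    def state_dict(self):
        # Copy only the selected attributes out of self.__dict__.
        return {key: self.__dict__[key] for key in self._STATE_KEYS}

    def set_state_dict(self, state_dict):
        # Restore the attributes that are present in the saved state.
        for key in self._STATE_KEYS:
            if key in state_dict:
                self.__dict__[key] = state_dict[key]

    # Same behavior under the second name.
    set_dict = set_state_dict


source = TinyPlateauState()
source.best_loss = 0.25
source.num_bad_epochs = 3
saved = source.state_dict()

restored = TinyPlateauState()
restored.set_state_dict(saved)
print(saved, restored.num_bad_epochs)
```

Saving the scheduler's state alongside the model lets the stagnation counters continue from where training stopped rather than restarting from zero.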
