ReduceOnPlateau
- class paddle.optimizer.lr.ReduceOnPlateau(learning_rate, mode='min', factor=0.1, patience=10, threshold=0.0001, threshold_mode='rel', cooldown=0, min_lr=0, epsilon=1e-08, verbose=False)
-
Reduces the learning rate when metrics has stopped descending. Models often benefit from reducing the learning rate by a factor of 2 to 10 once performance stops improving.
metrics is the value passed to step; its shape must be [] or [1]. When metrics stops descending for patience epochs, the learning rate is reduced to learning_rate * factor. (mode can also be set to 'max'; in that case, the learning rate is reduced when metrics stops ascending for patience epochs.)
In addition, after each reduction the scheduler waits cooldown epochs before resuming the operation above.
- Parameters
-
learning_rate (float) – The initial learning rate. It is a Python float.
mode (str, optional) – 'min' or 'max'. Normally it is 'min', which means the learning rate is reduced when loss stops descending. If it is set to 'max', the learning rate is reduced when loss stops ascending. Default: 'min'.
factor (float, optional) – The ratio by which the learning rate is reduced: new_lr = origin_lr * factor. It should be less than 1.0. Default: 0.1.
patience (int, optional) – When loss does not improve for this number of epochs, the learning rate is reduced. Default: 10.
threshold (float, optional) – threshold and threshold_mode together determine the minimum change of loss, so that tiny changes of loss are ignored. Default: 1e-4.
threshold_mode (str, optional) – 'rel' or 'abs'. In 'rel' mode, the minimum change of loss is last_loss * threshold, where last_loss is the loss in the last epoch. In 'abs' mode, the minimum change of loss is threshold. Default: 'rel'.
cooldown (int, optional) – The number of epochs to wait before resuming normal operation. Default: 0.
min_lr (float, optional) – The lower bound of the learning rate after reduction. Default: 0.
epsilon (float, optional) – Minimal decay applied to the learning rate. If the difference between the new and old learning rate is smaller than epsilon, the update is ignored. Default: 1e-8.
verbose (bool, optional) – If True, prints a message to stdout for each update. Default: False.
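The arithmetic behind threshold, threshold_mode, factor, min_lr and epsilon can be sketched in plain Python. This is a minimal illustration of the parameter descriptions above, not Paddle's implementation: the helper names min_change, is_improvement and reduced_lr are hypothetical, and reference stands for the loss value that the text calls last_loss.

```python
def min_change(reference, threshold=1e-4, threshold_mode="rel"):
    """Smallest change of loss that counts, following the text above."""
    if threshold_mode == "rel":
        return reference * threshold
    if threshold_mode == "abs":
        return threshold
    raise ValueError("threshold_mode must be 'rel' or 'abs'")


def is_improvement(current, reference, mode="min",
                   threshold=1e-4, threshold_mode="rel"):
    """True if `current` beats `reference` by more than the minimum change."""
    delta = min_change(reference, threshold, threshold_mode)
    if mode == "min":
        return current < reference - delta
    if mode == "max":
        return current > reference + delta
    raise ValueError("mode must be 'min' or 'max'")


def reduced_lr(lr, factor=0.1, min_lr=0.0, epsilon=1e-8):
    """Apply new_lr = lr * factor, clamp at min_lr, skip updates below epsilon."""
    new_lr = max(lr * factor, min_lr)
    return new_lr if lr - new_lr > epsilon else lr


print(is_improvement(0.99995, 1.0))  # False: the change is below 1.0 * 1e-4
print(is_improvement(0.9998, 1.0))   # True
print(reduced_lr(1.0, factor=0.5))   # 0.5
```

With the default threshold of 1e-4 in 'rel' mode, a loss that moves from 1.0 to 0.99995 is treated as no improvement, which is how tiny fluctuations are ignored.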
- Returns
-
A ReduceOnPlateau instance that schedules the learning rate.
Examples
>>> # Example1: train on default dynamic graph mode
>>> import paddle
>>> import numpy as np
>>> linear = paddle.nn.Linear(10, 10)
>>> scheduler = paddle.optimizer.lr.ReduceOnPlateau(learning_rate=1.0, factor=0.5, patience=5, verbose=True)
>>> sgd = paddle.optimizer.SGD(learning_rate=scheduler, parameters=linear.parameters())
>>> for epoch in range(20):
...     for batch_id in range(5):
...         x = paddle.uniform([10, 10])
...         out = linear(x)
...         loss = paddle.mean(out)
...         loss.backward()
...         sgd.step()
...         sgd.clear_gradients()
...         scheduler.step(loss)    # If you update learning rate each step
...     # scheduler.step(loss)      # If you update learning rate each epoch

>>> # Example2: train on static graph mode
>>> import paddle
>>> import numpy as np
>>> paddle.enable_static()
>>> main_prog = paddle.static.Program()
>>> start_prog = paddle.static.Program()
>>> with paddle.static.program_guard(main_prog, start_prog):
...     x = paddle.static.data(name='x', shape=[None, 4, 5])
...     y = paddle.static.data(name='y', shape=[None, 4, 5])
...     z = paddle.static.nn.fc(x, 100)
...     loss = paddle.mean(z)
...     scheduler = paddle.optimizer.lr.ReduceOnPlateau(learning_rate=1.0, factor=0.5, patience=5, verbose=True)
...     sgd = paddle.optimizer.SGD(learning_rate=scheduler)
...     sgd.minimize(loss)
...
>>> exe = paddle.static.Executor()
>>> exe.run(start_prog)
>>> for epoch in range(20):
...     for batch_id in range(5):
...         out = exe.run(
...             main_prog,
...             feed={
...                 'x': np.random.randn(3, 4, 5).astype('float32'),
...                 'y': np.random.randn(3, 4, 5).astype('float32')
...             },
...             fetch_list=loss.name)
...         scheduler.step(out[0])    # If you update learning rate each step
...     # scheduler.step(out[0])      # If you update learning rate each epoch
...
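To make the interplay of patience and cooldown easier to follow, here is a small framework-free sketch of the counting logic. It is an illustration under stated assumptions, not Paddle's source: the class name PlateauCounter is hypothetical, the comparison is made against the best metric seen so far, a reduction fires once the count of non-improving epochs exceeds patience, and threshold, min_lr and epsilon are left out to keep the counting visible.

```python
class PlateauCounter:
    """Illustrative patience/cooldown bookkeeping (hypothetical class)."""

    def __init__(self, learning_rate, mode="min", factor=0.1,
                 patience=10, cooldown=0):
        self.lr = float(learning_rate)
        self.mode = mode
        self.factor = factor
        self.patience = patience
        self.cooldown = cooldown
        self.best = None            # best metric seen so far (assumption)
        self.num_bad_epochs = 0     # consecutive epochs without improvement
        self.cooldown_counter = 0   # epochs left to wait after a reduction

    def step(self, metric):
        metric = float(metric)
        if self.cooldown_counter > 0:
            # Still cooling down: do not count this epoch at all.
            self.cooldown_counter -= 1
            return self.lr
        if self.best is None:
            improved = True
        elif self.mode == "min":
            improved = metric < self.best
        else:
            improved = metric > self.best
        if improved:
            self.best = metric
            self.num_bad_epochs = 0
        else:
            self.num_bad_epochs += 1
        if self.num_bad_epochs > self.patience:
            # Plateau detected: shrink the rate and start the cooldown.
            self.lr *= self.factor
            self.num_bad_epochs = 0
            self.cooldown_counter = self.cooldown
        return self.lr


counter = PlateauCounter(1.0, factor=0.5, patience=2, cooldown=1)
history = [counter.step(1.0) for _ in range(9)]
print(history)
```

Feeding a constant metric shows the rate halving each time the plateau is detected, with one extra epoch of waiting after every reduction because cooldown=1.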
-
state_keys()
-
For subclasses that override LRScheduler (the base class). By default, last_epoch and last_lr are saved, via self.keys = ['last_epoch', 'last_lr']. last_epoch is the current epoch number, and last_lr is the current learning rate.
To change the default behavior, provide a custom implementation of _state_keys() that redefines self.keys.
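The relationship between self.keys and the saved state can be sketched without Paddle. The classes below are hypothetical stand-ins written for illustration only; they mimic the behavior described above (a key list selects which attributes are saved) and assume the hook is named state_keys, as in the method heading.

```python
class SchedulerStateSketch:
    """Hypothetical stand-in for a scheduler base class."""

    def __init__(self):
        self.last_epoch = -1
        self.last_lr = 0.1
        self.verbose = False  # not listed in self.keys, so never saved

    def state_keys(self):
        # Default: only these two attributes are saved.
        self.keys = ["last_epoch", "last_lr"]

    def state_dict(self):
        self.state_keys()
        # The saved state is a subset of self.__dict__ chosen by self.keys.
        return {k: self.__dict__[k] for k in self.keys if k in self.__dict__}

    def set_state_dict(self, state):
        self.state_keys()
        for key in self.keys:
            self.__dict__[key] = state[key]


class WithBestSketch(SchedulerStateSketch):
    """Subclass that saves one extra attribute by redefining self.keys."""

    def __init__(self):
        super().__init__()
        self.best = None

    def state_keys(self):
        self.keys = ["last_epoch", "last_lr", "best"]


print(SchedulerStateSketch().state_dict())
print(WithBestSketch().state_dict())
```

Only the attributes named in self.keys appear in the returned dict, so a subclass that tracks extra state has to extend the list for that state to survive a save and restore.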
-
step(metrics, epoch=None)
-
step should be called after optimizer.step(). It updates the learning rate in the optimizer according to metrics. The new learning rate takes effect in the next epoch.
- Parameters
-
metrics (Tensor|numpy.ndarray|float) – The value monitored to decide whether the learning rate is reduced. If it stops descending for patience epochs, the learning rate is reduced. If it is a Tensor or numpy.ndarray, its numel must be 1.
epoch (int, None) – Specifies the current epoch. Default: None. Auto-increments from last_epoch=-1.
- Returns
-
None
Examples
Please refer to the examples of the current LRScheduler above.
-
get_lr()
-
For subclasses that override LRScheduler (the base class): users should provide a custom implementation of get_lr(). Otherwise, a NotImplementedError exception is raised.
-
set_dict(state_dict)
-
Loads the scheduler's state.
-
set_state_dict(state_dict)
-
Loads the scheduler's state.
-
state_dict()
-
Returns the state of the scheduler as a dict. It is a subset of self.__dict__.