PolynomialDecay
- class paddle.optimizer.lr.PolynomialDecay(learning_rate, decay_steps, end_lr=0.0001, power=1.0, cycle=False, last_epoch=-1, verbose=False)
-
Applies polynomial decay to the initial learning rate.
The algorithm can be described as follows.
If cycle is set to True, then:
\[ \begin{align}\begin{aligned}decay\_steps & = decay\_steps * math.ceil(\frac{epoch}{decay\_steps})\\new\_learning\_rate & = (learning\_rate-end\_lr)*(1-\frac{epoch}{decay\_steps})^{power}+end\_lr\end{aligned}\end{align} \]
If cycle is set to False, then:
\[ \begin{align}\begin{aligned}epoch & = min(epoch, decay\_steps)\\new\_learning\_rate & = (learning\_rate-end\_lr)*(1-\frac{epoch}{decay\_steps})^{power}+end\_lr\end{aligned}\end{align} \]
- Parameters
-
learning_rate (float) – The initial learning rate. It is a Python float.
decay_steps (int) – The number of steps over which the learning rate decays. It determines the decay cycle and must be a positive integer.
end_lr (float, optional) – The minimum final learning rate. Default: 0.0001.
power (float, optional) – Power of the polynomial. It should be greater than 0.0 for the learning rate to decay. Default: 1.0.
cycle (bool, optional) – Whether the learning rate rises again. If True, the learning rate rises again once it has decreased to end_lr. If False, the learning rate decreases monotonically. Default: False.
last_epoch (int, optional) – The index of the last epoch. Can be set to restart training. Default: -1, which means the initial learning rate.
verbose (bool, optional) – If True, prints a message to stdout for each update. Default: False.
- Returns
-
A PolynomialDecay instance to schedule the learning rate.
Examples
>>> # Example1: train on default dynamic graph mode
>>> import paddle
>>> import numpy as np
>>> # train on default dynamic graph mode
>>> linear = paddle.nn.Linear(10, 10)
>>> scheduler = paddle.optimizer.lr.PolynomialDecay(learning_rate=0.5, decay_steps=20, verbose=True)
>>> sgd = paddle.optimizer.SGD(learning_rate=scheduler, parameters=linear.parameters())
>>> for epoch in range(20):
...     for batch_id in range(5):
...         x = paddle.uniform([10, 10])
...         out = linear(x)
...         loss = paddle.mean(out)
...         loss.backward()
...         sgd.step()
...         sgd.clear_gradients()
...         scheduler.step()    # If you update learning rate each step
...     # scheduler.step()        # If you update learning rate each epoch

>>> # Example2: train on static graph mode
>>> import paddle
>>> import numpy as np
>>> paddle.enable_static()
>>> main_prog = paddle.static.Program()
>>> start_prog = paddle.static.Program()
>>> with paddle.static.program_guard(main_prog, start_prog):
...     x = paddle.static.data(name='x', shape=[None, 4, 5])
...     y = paddle.static.data(name='y', shape=[None, 4, 5])
...     z = paddle.static.nn.fc(x, 100)
...     loss = paddle.mean(z)
...     scheduler = paddle.optimizer.lr.PolynomialDecay(learning_rate=0.5, decay_steps=20, verbose=True)
...     sgd = paddle.optimizer.SGD(learning_rate=scheduler)
...     sgd.minimize(loss)
...
>>> exe = paddle.static.Executor()
>>> exe.run(start_prog)
>>> for epoch in range(20):
...     for batch_id in range(5):
...         out = exe.run(
...             main_prog,
...             feed={
...                 'x': np.random.randn(3, 4, 5).astype('float32'),
...                 'y': np.random.randn(3, 4, 5).astype('float32')
...             },
...             fetch_list=loss.name)
...         scheduler.step()    # If you update learning rate each step
...     # scheduler.step()        # If you update learning rate each epoch
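To make the two decay rules from the class description concrete, the following is a minimal pure-Python sketch that evaluates them for a given epoch. The function name `polynomial_decay_lr` is hypothetical and is not part of Paddle. The sketch also assumes a multiplier of at least 1 in the `cycle=True` branch, because the formula as written would produce a zero-length decay window at epoch 0.

```python
import math


def polynomial_decay_lr(epoch, learning_rate, decay_steps,
                        end_lr=0.0001, power=1.0, cycle=False):
    """Evaluate the polynomial decay formula for one epoch (illustrative sketch)."""
    if cycle:
        # Stretch the decay window to the next multiple of decay_steps.
        # Assumption: the multiplier is at least 1, so epoch 0 does not
        # lead to a division by zero.
        multiplier = max(1, math.ceil(epoch / decay_steps))
        decay_steps = decay_steps * multiplier
    else:
        # Once decay_steps is reached, the rate stays at end_lr.
        epoch = min(epoch, decay_steps)
    return (learning_rate - end_lr) * (1 - epoch / decay_steps) ** power + end_lr


for epoch in (0, 10, 20, 21, 30):
    flat = polynomial_decay_lr(epoch, 0.5, 20)
    cyc = polynomial_decay_lr(epoch, 0.5, 20, cycle=True)
    print(f"epoch={epoch:2d}  cycle=False: {flat:.6f}  cycle=True: {cyc:.6f}")
```

With `cycle=False` the value stays at `end_lr` from epoch 20 onward. With `cycle=True` it jumps back up at epoch 21 (to about 0.2376 for these settings), because the decay window doubles to 40 steps.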
- get_lr()
-
Subclasses that override LRScheduler (the base class) should provide a custom implementation of get_lr(). Otherwise, a NotImplementedError exception is thrown.
- set_dict(state_dict)
-
Loads the scheduler's state.
- set_state_dict(state_dict)
-
Loads the scheduler's state.
- state_dict()
-
Returns the state of the scheduler as a dict. It is a subset of self.__dict__.
- state_keys()
-
For subclasses that override LRScheduler (the base class). By default, last_epoch and last_lr are saved, via self.keys = ['last_epoch', 'last_lr']. last_epoch is the current epoch number, and last_lr is the current learning rate. To change the default behavior, provide a custom implementation of _state_keys() that redefines self.keys.
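The relationship between the saved keys, state_dict, and set_state_dict can be pictured with a small stand-in class. This is only a sketch of the behavior described above: `TinyScheduler` is a hypothetical name, it does not import or reproduce Paddle's LRScheduler, and the attribute `base_lr` is included purely to show an attribute that is left out of the saved state.

```python
class TinyScheduler:
    """Hypothetical stand-in for illustration; this is not Paddle's LRScheduler."""

    def __init__(self, learning_rate):
        self.base_lr = learning_rate  # not saved: it is not listed in self.keys
        self.last_epoch = 0
        self.last_lr = learning_rate
        self.state_keys()

    def state_keys(self):
        # Default set of attributes that make up the saved state.
        self.keys = ['last_epoch', 'last_lr']

    def state_dict(self):
        # The state is a subset of self.__dict__, restricted to self.keys.
        return {key: self.__dict__[key] for key in self.keys}

    def set_state_dict(self, state_dict):
        # Restore only the attributes named in self.keys.
        for key in self.keys:
            self.__dict__[key] = state_dict[key]


# Pretend some training has happened, then take a checkpoint.
trained = TinyScheduler(learning_rate=0.5)
trained.last_epoch, trained.last_lr = 7, 0.325
checkpoint = trained.state_dict()

# A fresh scheduler picks up where the first one stopped.
resumed = TinyScheduler(learning_rate=0.5)
resumed.set_state_dict(checkpoint)
print(checkpoint)
```

A subclass that needs more state would, under the same assumptions, extend `self.keys` in its own key-defining method so that the extra attributes travel with the checkpoint.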
- step(epoch=None)
-
step should be called after optimizer.step. It updates the learning rate in the optimizer according to the current epoch. The new learning rate takes effect on the next optimizer.step.
- Parameters
-
epoch (int, None) – Specifies the current epoch. Default: None, in which case the epoch auto-increments from last_epoch=-1.
- Returns
-
None
Examples
>>> import paddle
>>> value = paddle.arange(26, dtype='float32')
>>> a = paddle.reshape(value, [2, 13])
>>> linear = paddle.nn.Linear(13, 5)
>>> adadelta = paddle.optimizer.Adadelta(learning_rate=0.0003, epsilon=1e-06, rho=0.95,
...                             parameters=linear.parameters())
>>> out = linear(a)
>>> out.backward()
>>> adadelta.step()
>>> adadelta.clear_grad()
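Because the schedule depends only on how many times step has been called, where the call sits in the training loop changes how quickly the rate decays. The sketch below compares the two placements for a run of 20 epochs with 5 batches each, using learning_rate=0.5 and decay_steps=20. It is a pure-Python illustration of the `cycle=False`, `power=1.0` formula; the helper `linear_decay_lr` is a hypothetical name, not a Paddle function.

```python
def linear_decay_lr(step, learning_rate=0.5, decay_steps=20, end_lr=0.0001):
    """cycle=False, power=1.0 case; `step` counts completed scheduler.step() calls."""
    step = min(step, decay_steps)
    return (learning_rate - end_lr) * (1 - step / decay_steps) + end_lr


EPOCHS, BATCHES_PER_EPOCH = 20, 5

# Placement 1: step after every batch. At the start of epoch e,
# e * BATCHES_PER_EPOCH steps have already been taken.
per_batch = [linear_decay_lr(epoch * BATCHES_PER_EPOCH) for epoch in range(EPOCHS)]

# Placement 2: step once per epoch. At the start of epoch e,
# e steps have already been taken.
per_epoch = [linear_decay_lr(epoch) for epoch in range(EPOCHS)]

for epoch in (0, 4, 19):
    print(f"epoch {epoch:2d}: per-batch lr={per_batch[epoch]:.6f}  "
          f"per-epoch lr={per_epoch[epoch]:.6f}")
```

Stepping per batch exhausts the 20 decay steps after four epochs, so the remaining epochs run at `end_lr`. Stepping per epoch spreads the same decay over the whole run. If per-batch stepping is intended, `decay_steps` would typically be chosen in units of batches rather than epochs.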