SequenceParallelBegin
class paddle.distributed.SequenceParallelBegin(need_transpose: bool = True)
Sequence-parallel (sp) plan for the mp config. This plan marks the beginning of the sp range and should be added to the LAST layer before that range.
Note
DON’T mark any layer in the sp range.
Parameters
need_transpose (bool) – Defaults to True. With need_transpose=True, this plan converts the layer output from [b, s, h] to [s/mp, b, h]. With need_transpose=False, it converts the output from [s, b, h] to [s/mp, b, h]. Here b, s, and h denote batch size, sequence length, and hidden size, and mp is the model-parallel degree.
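The shape rule above can be sketched in plain Python. This is only shape bookkeeping under stated assumptions, not Paddle's implementation: `sp_begin_output_shape` and `mp_degree` are hypothetical names introduced for illustration, and the sketch assumes the sequence length is evenly divisible by the model-parallel degree.

```python
def sp_begin_output_shape(shape, mp_degree, need_transpose=True):
    """Return the per-rank output shape [s/mp, b, h] for a given input shape.

    Hypothetical helper for illustration only; it is not part of Paddle.
    """
    if need_transpose:
        b, s, h = shape  # input laid out as [b, s, h]
    else:
        s, b, h = shape  # input already laid out as [s, b, h]
    # Assumption: the sequence dimension splits evenly across mp ranks.
    if s % mp_degree != 0:
        raise ValueError("sequence length must be divisible by mp_degree")
    return (s // mp_degree, b, h)


# Batch 2, sequence length 16, hidden size 8, model-parallel degree 4.
print(sp_begin_output_shape((2, 16, 8), mp_degree=4))
print(sp_begin_output_shape((16, 2, 8), mp_degree=4, need_transpose=False))
```

Both calls describe the same tensor in different input layouts, so both yield the same per-rank shape.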
Examples
>>> import paddle
>>> import paddle.distributed as dist
>>> class MLP(paddle.nn.Layer):
...     def __init__(self):
...         super().__init__()
...         self.fc1 = paddle.nn.Linear(8, 8)
...         self.fc2 = paddle.nn.Linear(8, 8)
...
...     def forward(self, input):
...         return self.fc2(self.fc1(input))
>>>
>>> layer = MLP()
>>> mp_config = {
...     'fc1': dist.SequenceParallelBegin()
... }