SequenceParallelDisable
class paddle.distributed.SequenceParallelDisable(need_transpose: bool = True)
Sequence parallel plan for the model parallel (mp) config. Disables sequence parallelism on the layer.
Parameters
need_transpose (bool, optional) – Defaults to True. If True, this plan transfers the input from [s/mp, b, h] to [b, s, h] before the layer runs, then transfers the output from [b, s, h] back to [s/mp, b, h]. If False, this plan transfers the input from [s/mp, b, h] to [s, b, h], then transfers the output from [s, b, h] back to [s/mp, b, h]. Here s is the sequence length, b the batch size, h the hidden size, and mp the model parallel degree.
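The layout changes above can be sketched with plain NumPy, independent of Paddle. This is only an illustration of the shape transformations, assuming the [s/mp, b, h] shards are gathered along the sequence axis before the layer runs; the variable names and the use of concatenate as a stand-in for the collective gather are assumptions for illustration, not Paddle internals.

```python
import numpy as np

# Illustration only: shape bookkeeping for SequenceParallelDisable.
# s = sequence length, b = batch size, h = hidden size, mp = mp degree.
s, b, h, mp = 8, 2, 4, 2

# Each rank holds a sequence shard of shape [s/mp, b, h].
shard = np.zeros((s // mp, b, h))

# Stand-in for gathering the shards to the full sequence: [s, b, h].
gathered = np.concatenate([shard] * mp, axis=0)
assert gathered.shape == (s, b, h)

# need_transpose=True: the layer sees [b, s, h].
as_bsh = np.transpose(gathered, (1, 0, 2))
assert as_bsh.shape == (b, s, h)

# need_transpose=False: the layer sees [s, b, h] directly (no transpose).
assert gathered.shape == (s, b, h)
```

On the way out, the same steps run in reverse: the output is transposed back (if needed) and re-split into [s/mp, b, h] shards.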
Examples
>>> import paddle
>>> import paddle.distributed as dist

>>> class MLP(paddle.nn.Layer):
...     def __init__(self):
...         super().__init__()
...         self.fc1 = paddle.nn.Linear(8, 8)
...         self.fc2 = paddle.nn.Linear(8, 8)
...
...     def forward(self, input):
...         return self.fc2(self.fc1(input))

>>> layer = MLP()
>>> mp_config = {
...     'fc1': dist.SequenceParallelDisable()
... }