ParallelEnv
- class paddle.distributed.ParallelEnv
Note
This API is not recommended. If you need to get the rank and world size, it is recommended to use paddle.distributed.get_rank() and paddle.distributed.get_world_size().

This class is used to obtain the environment variables required for the parallel execution of paddle.nn.Layer in dynamic mode.

The parallel execution in dynamic mode needs to be started using paddle.distributed.launch or paddle.distributed.spawn.

Examples
import paddle
import paddle.distributed as dist

def train():
    # 1. initialize parallel environment
    dist.init_parallel_env()

    # 2. get current ParallelEnv
    parallel_env = dist.ParallelEnv()
    print("rank: ", parallel_env.rank)
    print("world_size: ", parallel_env.world_size)

    # print result in process 1:
    # rank: 0
    # world_size: 2

    # print result in process 2:
    # rank: 1
    # world_size: 2

if __name__ == '__main__':
    # 1. start by ``paddle.distributed.spawn`` (default)
    dist.spawn(train, nprocs=2)
    # 2. start by ``paddle.distributed.launch``
    # train()
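As the note above suggests, the same information can be obtained through the recommended functional API instead of ParallelEnv; a minimal sketch of that alternative, using the same launch flow as the example above:

import paddle.distributed as dist

def train():
    # initialize the parallel environment first
    dist.init_parallel_env()
    # recommended replacements for ParallelEnv().rank / .world_size
    rank = dist.get_rank()
    world_size = dist.get_world_size()
    print("rank: ", rank)
    print("world_size: ", world_size)

if __name__ == '__main__':
    dist.spawn(train, nprocs=2)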
- property rank
Rank of the current trainer.
Its value is equal to the value of the environment variable PADDLE_TRAINER_ID. The default value is 0.
Examples
# execute this command in terminal: export PADDLE_TRAINER_ID=0
import paddle.distributed as dist

env = dist.ParallelEnv()
print("The rank is %d" % env.rank)
# The rank is 0
- property world_size
The number of trainers (the number of processes participating in the current job).
Its value is equal to the value of the environment variable PADDLE_TRAINERS_NUM. The default value is 1.
Examples
# execute this command in terminal: export PADDLE_TRAINERS_NUM=4
import paddle.distributed as dist

env = dist.ParallelEnv()
print("The world_size is %d" % env.world_size)
# The world_size is 4
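As an illustration only (not part of this API), rank and world_size are typically combined to shard work across trainers; a minimal sketch, assuming the environment variables are exported as above and using a hypothetical list of sample indices:

import paddle.distributed as dist

env = dist.ParallelEnv()
samples = list(range(10))  # hypothetical dataset indices
# each trainer takes every world_size-th sample, starting at its own rank
local_samples = samples[env.rank::env.world_size]
print("trainer %d handles %s" % (env.rank, local_samples))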
- property device_id
The ID of the selected GPU card for parallel training.
Its value is equal to the value of the environment variable FLAGS_selected_gpus. The default value is 0.
Examples
# execute this command in terminal: export FLAGS_selected_gpus=1
import paddle.distributed as dist

env = dist.ParallelEnv()
print("The device id is %d" % env.device_id)
# The device id is 1
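For illustration, the selected card ID can be used to bind the process to that GPU; a minimal sketch, assuming a GPU build of Paddle and FLAGS_selected_gpus exported as above (binding the device explicitly like this is an example usage, not something this property does by itself):

import paddle
import paddle.distributed as dist

env = dist.ParallelEnv()
# bind this process to the GPU selected via FLAGS_selected_gpus
paddle.device.set_device("gpu:%d" % env.device_id)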
- property current_endpoint
The endpoint of the current trainer, in the form of node IP + port (e.g. 127.0.0.1:6170).
Its value is equal to the value of the environment variable PADDLE_CURRENT_ENDPOINT. The default value is "".
Examples
# execute this command in terminal: export PADDLE_CURRENT_ENDPOINT=127.0.0.1:6170
import paddle.distributed as dist

env = dist.ParallelEnv()
print("The current endpoint is %s" % env.current_endpoint)
# The current endpoint is 127.0.0.1:6170
- property trainer_endpoints
The endpoints of all trainer nodes in the task, which are used to broadcast the NCCL ID when NCCL2 is initialized.
Its value is equal to the value of the environment variable PADDLE_TRAINER_ENDPOINTS. The default value is "".
Examples
# execute this command in terminal: export PADDLE_TRAINER_ENDPOINTS=127.0.0.1:6170,127.0.0.1:6171
import paddle.distributed as dist

env = dist.ParallelEnv()
print("The trainer endpoints are %s" % env.trainer_endpoints)
# The trainer endpoints are ['127.0.0.1:6170', '127.0.0.1:6171']
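The returned list corresponds to the comma-separated value of PADDLE_TRAINER_ENDPOINTS; the sketch below only illustrates that relationship by splitting the variable directly (an assumption about the format, not Paddle's internal parsing):

import os

# assumes: export PADDLE_TRAINER_ENDPOINTS=127.0.0.1:6170,127.0.0.1:6171
raw = os.environ.get("PADDLE_TRAINER_ENDPOINTS", "")
endpoints = raw.split(",") if raw else []
print(endpoints)  # ['127.0.0.1:6170', '127.0.0.1:6171']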
- property nrings
The number of NCCL communication rings used by the current trainer.
Its value is equal to the value of the environment variable FLAGS_nccl_nrings. The default value is 1.
Examples
# execute this command in terminal: export FLAGS_nccl_nrings=1
import paddle.distributed as dist

env = dist.ParallelEnv()
print("The nrings is %d" % env.nrings)
# The nrings is 1
- property local_rank
Rank of the current trainer. Alias of rank.
Its value is equal to the value of the environment variable PADDLE_TRAINER_ID. The default value is 0.
Examples
# execute this command in terminal: export PADDLE_TRAINER_ID=0
import paddle.distributed as dist

env = dist.ParallelEnv()
print("The local rank is %d" % env.local_rank)
# The local rank is 0
- property nranks
The number of trainers (the number of processes participating in the current job). Alias of world_size.
Its value is equal to the value of the environment variable PADDLE_TRAINERS_NUM. The default value is 1.
Examples
# execute this command in terminal: export PADDLE_TRAINERS_NUM=4
import paddle.distributed as dist

env = dist.ParallelEnv()
print("The nranks is %d" % env.nranks)
# The nranks is 4
- property dev_id
The ID of the selected GPU card for parallel training. Alias of device_id.
Its value is equal to the value of the environment variable FLAGS_selected_gpus. The default value is 0.
Examples
# execute this command in terminal: export FLAGS_selected_gpus=1
import paddle.distributed as dist

env = dist.ParallelEnv()
print("The device id is %d" % env.dev_id)
# The device id is 1