Open MPI Development Mailing List Archives

Subject: [OMPI devel] Process ranks
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-10-20 11:59:16


Since people may not be fully familiar, and because things have evolved, I thought it might help to provide a brief explanation of the ranks we assign to processes in OMPI.

Each process has four "ranks" assigned to it at launch:

1. vpid - equivalent to its MPI rank within the job. You can access the vpid with ORTE_PROC_MY_NAME->vpid.

2. local_rank - the relative rank of the process, within its own job, on the local node. For example, if there are three processes from this job on the node, then the lowest vpid process would have local_rank=0, the next highest vpid process would have local_rank=1, etc. The local_rank is typically used by the shared memory subsystem to decide which proc will create the backing file.

Note that processes from dynamically spawned jobs on the node will have overlapping local_ranks. For example, if a process in the above job were to comm_spawn two more procs onto the node, the lower vpid of those two would also have local_rank=0, because it belongs to a different jobid.
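To make the local_rank rule concrete, here is a minimal sketch that derives it from a flat list of processes. It is not ORTE code: proc_t and sketch_local_rank are hypothetical names invented for illustration, and the real implementation may compute the value differently. It only encodes the rule described above, namely that a proc's local_rank is the number of procs on the same node and in the same job that have a lower vpid.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a process record; NOT an ORTE type. */
typedef struct {
    unsigned jobid;  /* job the process belongs to */
    unsigned vpid;   /* rank within that job */
    int      node;   /* index of the node it runs on */
} proc_t;

/* local_rank = count of procs on the same node AND in the same job
 * that have a lower vpid than procs[self]. */
static int sketch_local_rank(const proc_t *procs, size_t n, size_t self)
{
    int rank = 0;
    for (size_t i = 0; i < n; i++) {
        if (procs[i].node  == procs[self].node &&
            procs[i].jobid == procs[self].jobid &&
            procs[i].vpid  <  procs[self].vpid) {
            rank++;
        }
    }
    return rank;
}
```

With three procs from job 1 on a node plus two spawned procs from job 2 on the same node, the job 1 procs get local_rank 0, 1, 2 and the job 2 procs get 0 and 1, which is the overlap described above.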

Every process has full knowledge of the local_rank for every other process executing within that mpirun AND for any proc that connected to it via MPI connect/accept or comm_spawn (the info is included in the modex during the connect/accept procedure). You can obtain the local_rank of any process using

orte_local_rank_t orte_ess.get_local_rank(proc_name)

This will return ORTE_LOCAL_RANK_INVALID if the info isn't known.

3. node_rank - the relative rank of the process, spanning all jobs under this mpirun, on the local node. The node_rank is typically used by the OOB to select a static port from the given range, thus ensuring that each proc on the node - regardless of job - takes a unique port. For example, if there are three processes from this job on the node, then the lowest vpid process would have node_rank=0, the next highest vpid process would have node_rank=1, etc. If one of them then comm_spawns another process onto the node, that process will have node_rank=3, since the computation spans -all- jobs.
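The node_rank example can be sketched the same way. Again, this is not ORTE code: proc_t, sketch_node_rank and sketch_static_port are hypothetical names. Two assumptions are baked in. First, procs on a node are ordered by (jobid, vpid), with later-spawned jobs having higher jobids. Second, the port is simply the base of the range plus the node_rank. The text above only says that the OOB uses node_rank to pick a unique port from the range, so the actual mapping may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a process record; NOT an ORTE type. */
typedef struct {
    unsigned jobid;
    unsigned vpid;
    int      node;
} proc_t;

/* Assumed ordering: by jobid first, then by vpid. */
static int sorts_before(const proc_t *a, const proc_t *b)
{
    if (a->jobid != b->jobid) {
        return a->jobid < b->jobid;
    }
    return a->vpid < b->vpid;
}

/* node_rank = count of procs on the same node, from ANY job,
 * that sort before procs[self]. */
static int sketch_node_rank(const proc_t *procs, size_t n, size_t self)
{
    int rank = 0;
    for (size_t i = 0; i < n; i++) {
        if (procs[i].node == procs[self].node &&
            sorts_before(&procs[i], &procs[self])) {
            rank++;
        }
    }
    return rank;
}

/* Assumed port choice: base_port + node_rank.
 * Returns -1 if the range is too small for this node_rank. */
static int sketch_static_port(int base_port, int range_len, int node_rank)
{
    if (node_rank < 0 || node_rank >= range_len) {
        return -1;
    }
    return base_port + node_rank;
}
```

Because node_rank is unique per node across jobs, any one-to-one mapping from node_rank into the range gives each proc its own port. The -1 return marks the case where the range has fewer ports than there are procs on the node.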

Every process has full knowledge of the node_rank for every other process executing within that mpirun AND for any proc that connected to it via MPI connect/accept or comm_spawn (the info is included in the modex during the connect/accept procedure). You can obtain the node_rank of any process using

orte_node_rank_t orte_ess.get_node_rank(proc_name)

This will return ORTE_NODE_RANK_INVALID if the info isn't known.

4. app_rank - the relative rank of the process within its app_context. This equates to the vpid for a job that contains only one app_context. However, for jobs with multiple app_contexts, this value provides a way of determining a proc's rank solely within its own app_context. Each process only has access to its own app_rank in orte_process_info - it doesn't have any knowledge of the app_rank for other processes.
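As an illustration of how app_rank relates to vpid, here is a small sketch. It assumes that vpids are assigned contiguously, app_context by app_context, in command-line order. That is an assumption made for the example, not something stated above, and sketch_app_rank is a hypothetical helper, not an ORTE function. Under that assumption, app_rank is the vpid minus the first vpid of the proc's app_context.

```c
#include <assert.h>
#include <stddef.h>

/* app_sizes[i] is the number of procs in app_context i.
 * Returns the app_rank for the given vpid and, if app_idx is non-NULL,
 * stores the index of the app_context the vpid falls in.
 * Returns -1 if vpid is beyond the last app_context. */
static int sketch_app_rank(const unsigned *app_sizes, size_t num_apps,
                           unsigned vpid, size_t *app_idx)
{
    unsigned first = 0;  /* first vpid of the current app_context */
    for (size_t i = 0; i < num_apps; i++) {
        if (vpid < first + app_sizes[i]) {
            if (app_idx != NULL) {
                *app_idx = i;
            }
            return (int)(vpid - first);
        }
        first += app_sizes[i];
    }
    return -1;
}
```

For a job with a single app_context the function returns the vpid itself, matching the statement above. For a job with two app_contexts of 2 and 3 procs, vpid 2 maps to app_rank 0 in the second app_context.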

HTH
Ralph