Open MPI Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: [OMPI devel] Process ranks
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-10-20 11:59:16


Since people may not be fully familiar, and because things have evolved, I thought it might help to provide a brief explanation of the ranks we assign to processes in OMPI.

Each process has four "ranks" assigned to it at launch:

1. vpid - equivalent to its MPI rank within the job. You can access the vpid with ORTE_PROC_MY_NAME->vpid.

2. local_rank - the relative rank of the process, within its own job, on the local node. For example, if there are three processes from this job on the node, then the lowest vpid process would have local_rank=0, the next highest vpid process would have local_rank=1, etc. The local_rank is typically used by the shared memory subsystem to decide which proc will create the backing file.

Note that processes from dynamically spawned jobs on the node will have overlapping local_ranks. For example, if a process in the above job were to comm_spawn two more procs onto the node, the lowest vpid of those would also have local_rank=0, because it belongs to a different job (jobid).

Every process has full knowledge of the local_rank for every other process executing within that mpirun AND for any proc that connected to it via MPI connect/accept or comm_spawn (the info is included in the modex during the connect/accept procedure). You can obtain the local_rank of any process using

orte_local_rank_t orte_ess.get_local_rank(proc_name)

This will return ORTE_LOCAL_RANK_INVALID if the info isn't known.

3. node_rank - the relative rank of the process, spanning all jobs under this mpirun, on the local node. The node_rank is typically used by the OOB to select a static port from the given range, thus ensuring that each proc on the node - regardless of job - takes a unique port. For example, if there are three processes from this job on the node, then the lowest vpid process would have node_rank=0, the next highest vpid process would have node_rank=1, etc. If one of them then comm_spawns another process onto the node, that process will have node_rank=3 since the computation spans -all- jobs.

Every process has full knowledge of the node_rank for every other process executing within that mpirun AND for any proc that connected to it via MPI connect/accept or comm_spawn (the info is included in the modex during the connect/accept procedure). You can obtain the node_rank of any process using

orte_node_rank_t orte_ess.get_node_rank(proc_name)

This will return ORTE_NODE_RANK_INVALID if the info isn't known.

4. app_rank - the relative rank of the process within its app_context. This equates to the vpid for a job that contains only one app_context. However, for jobs with multiple app_contexts, this value provides a way of determining a proc's rank solely within its own app_context. Each process only has access to its own app_rank in orte_process_info - it doesn't have any knowledge of the app_rank for other processes.
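
To make the definitions above concrete, here is a minimal sketch in plain C that derives the three relative ranks from a flat table of processes. It is an illustration only, not ORTE code: the proc_t struct, its fields, and the function names are all hypothetical, and the table is assumed to be listed in launch order (original job first, spawned jobs after). The static_port helper likewise assumes the port is picked by indexing the range with node_rank, which is one plausible reading of the description rather than a statement of what the OOB does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-process record; these are NOT ORTE's data structures. */
typedef struct {
    uint32_t jobid;    /* job the process belongs to */
    uint32_t vpid;     /* rank within that job */
    uint32_t node;     /* index of the node the process runs on */
    uint32_t app_idx;  /* index of its app_context within the job */
} proc_t;

/* local_rank: count procs of the SAME job on the same node with a lower vpid. */
uint32_t local_rank(const proc_t *procs, size_t n, size_t me)
{
    uint32_t rank = 0;
    for (size_t i = 0; i < n; i++) {
        if (procs[i].node == procs[me].node &&
            procs[i].jobid == procs[me].jobid &&
            procs[i].vpid < procs[me].vpid) {
            rank++;
        }
    }
    return rank;
}

/* node_rank: count procs of ANY job on the same node that appear earlier
 * in the table (assumption: the table is in launch order). */
uint32_t node_rank(const proc_t *procs, size_t n, size_t me)
{
    uint32_t rank = 0;
    for (size_t i = 0; i < me && i < n; i++) {
        if (procs[i].node == procs[me].node) {
            rank++;
        }
    }
    return rank;
}

/* app_rank: count procs of the same job and same app_context with a lower vpid. */
uint32_t app_rank(const proc_t *procs, size_t n, size_t me)
{
    uint32_t rank = 0;
    for (size_t i = 0; i < n; i++) {
        if (procs[i].jobid == procs[me].jobid &&
            procs[i].app_idx == procs[me].app_idx &&
            procs[i].vpid < procs[me].vpid) {
            rank++;
        }
    }
    return rank;
}

/* Assumed selection rule: the node_rank-th port of the range, or -1 if the
 * range is too small for this process. */
long static_port(uint16_t first_port, uint32_t range_len, uint32_t nrank)
{
    if (nrank >= range_len) {
        return -1;
    }
    return (long)first_port + (long)nrank;
}
```

With three procs of one job on a node followed by two comm_spawned procs of a second job, the first spawned proc gets local_rank=0 (it overlaps with the original job) but node_rank=3, which matches the examples given for items 2 and 3.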

HTH
Ralph