
Subject: Re: [OMPI users] Method for worker to determine its "rank" on a single machine?
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-12-10 13:25:46


There are no race conditions in this data. It is determined by mpirun prior to launch, so all procs receive the data during MPI_Init, and it remains static throughout the life of the job. It isn't dynamically updated at this time (that will change in later versions), so it won't tell you if a process is sitting in finalize, for example.

First, you have to configure OMPI with --with-devel-headers to get access to the required functions.
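
For example, something like this (the prefix is just illustrative - use whatever you normally build with):

./configure --prefix=/opt/openmpi --with-devel-headers
make all install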

If you look at the file orte/mca/ess/ess.h, you'll see functions like

orte_ess.proc_get_local_rank(orte_process_name_t *name)

You can call that function with any process name. In the ORTE world, process names are a struct of two fields: a jobid that is common to all processes in your application, and a vpid that is the MPI rank. We also have a predefined variable (ORTE_PROC_MY_NAME) for your own name to make life a little easier.

So if you wanted to get your own local rank, you would call:

#include "orte/types.h"
#include "orte/runtime/orte_globals.h"
#include "orte/mca/ess/ess.h"

orte_local_rank_t my_local_rank;

my_local_rank = orte_ess.proc_get_local_rank(ORTE_PROC_MY_NAME);

To get the local rank of some other process in the job, you would call:

#include "orte/types.h"
#include "orte/runtime/orte_globals.h"
#include "orte/mca/ess/ess.h"

orte_process_name_t name;
orte_local_rank_t his_local_rank;

name.jobid = ORTE_PROC_MY_NAME->jobid;
name.vpid = <mpi rank of the other proc>;

his_local_rank = orte_ess.proc_get_local_rank(&name);

The node rank only differs from the local rank when a comm_spawn has been executed. If you need that capability, I can explain the difference - for now, you can ignore that function.

I don't currently provide the max number of local procs to each process or a list of local procs, but can certainly do so - nobody had a use for it before. Or you can construct those pieces of info fairly easily from data you do have. What you would do is loop over the get_proc_locality call:

#include "opal/mca/paffinity/paffinity.h"
#include "orte/types.h"
#include "orte/runtime/orte_globals.h"
#include "orte/mca/ess/ess.h"

orte_vpid_t v;
orte_process_name_t name;

name.jobid = ORTE_PROC_MY_NAME->jobid;

for (v = 0; v < orte_process_info.num_procs; v++) {
    name.vpid = v;
    if (OPAL_PROC_ON_NODE & orte_ess.proc_get_locality(&name)) {
        /* the proc is on your node - do whatever with it */
    }
}
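
As an untested sketch of how that loop turns into the helpers David proposed below (I'm assuming no comm_spawn, so every proc shares your jobid, and I've dropped the comm argument since this operates on the whole job):

#include "opal/mca/paffinity/paffinity.h"
#include "orte/types.h"
#include "orte/runtime/orte_globals.h"
#include "orte/mca/ess/ess.h"

/* Fill llist with the MPI ranks of all procs on this node.
 * llist must hold at least orte_process_info.num_procs entries. */
int my_MPI_Local_list(int *llist, int *lactual)
{
    orte_vpid_t v;
    orte_process_name_t name;
    int n = 0;

    name.jobid = ORTE_PROC_MY_NAME->jobid;

    for (v = 0; v < orte_process_info.num_procs; v++) {
        name.vpid = v;
        if (OPAL_PROC_ON_NODE & orte_ess.proc_get_locality(&name)) {
            llist[n++] = (int)v;   /* vpid == MPI rank in this job */
        }
    }
    *lactual = n;
    return 0;
}

The lactual you get back is the number of local procs, and for the standard mappers your own position in llist will match what proc_get_local_rank returns for your own name.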

HTH
Ralph

On Dec 10, 2010, at 9:49 AM, David Mathog wrote:

>> The answer is yes - sort of...
>>
>> In OpenMPI, every process has information about not only its own local
>> rank, but the local rank of all its peers regardless of what node they
>> are on. We use that info internally for a variety of things.
>>
>> Now the "sort of". That info isn't exposed via an MPI API at this
>> time. If that doesn't matter, then I can tell you how to get it - it's
>> pretty trivial to do.
>
> Please tell me how to do this using the internal information.
>
> For now I will use that to write these functions (which might at some
> point correspond to standard functions, or not)
>
> my_MPI_Local_size(MPI_Comm comm, int *lmax, int *lactual)
> my_MPI_Local_rank(MPI_Comm comm, int *lrank)
>
> These will return N for lmax, a value M in 1->N for lactual, and a value
> in 1->M for lrank, for any worker on a machine corresponding to a
> hostfile line like:
>
> node123.cluster slots=N
>
> As usual, this could get complicated. There are probably race
> conditions on lactual vs. lrank as the workers start, but I'm guessing
> the lrank to lmax relationship won't have that problem. Similarly, the
> meaning of "local" is pretty abstract. For now all that is intended is
> "a group of equivalent cores within a single enclosure, where
> communication between them is strictly internal to the enclosure, and
> where all have equivalent access to the local disks and the network
> interface(s)". Other ways to define "local" might make more sense on
> more complex hardware.
>
> Another function that logically belongs with these is:
>
> my_MPI_Local_list(MPI_Comm comm, int *llist, int *lactual)
>
> I don't need it now, but can imagine applications that would. This
> would return the (current) lactual value and the corresponding list of
> rank numbers of all the local workers. The array llist must be of size
> lmax.
>
>
> Thanks,
>
> David Mathog
> mathog_at_[hidden]
> Manager, Sequence Analysis Facility, Biology Division, Caltech